About the Science Studio Project

Experiment Management

Scientists typically encounter, or are given, a problem that requires a solution; in other words, a question that needs to be answered. They usually generate samples to aid in the search for that solution. In most cases these samples will be acted upon or analysed by more than one technique, which may require access to more than one device, and these devices may be situated at different facilities. All aspects of the scientific problem or question are therefore best organized under one ‘umbrella’, which in Science Studio is the ‘Project’.

The goal of the “Science Studio” project is to create a complete experiment management system that will allow researchers to control and observe, from their home base, all aspects of research that must be carried out at specialized laboratories throughout Canada. The successful implementation of this experiment management system will give scientists vastly improved access to these expensive laboratories and, as a result, increase research productivity, create more opportunities for collaboration and heighten awareness of the potential value of these laboratories among all researchers and the general populace.

The project is very innovative because the software is designed, for the first time, to manage all aspects of the experiment process, from sample selection and scheduling, to control and observation of the experiment, to collaborative review of the data and its significance. Further, “Science Studio” is an experiment management system that organises and records all aspects of the experiment process, including: initial sample characterisation, measurement and labeling; shipping to one or more facilities; a repository of all raw and processed data and experimental conditions; discussions of the results with collaborators; and additional ideas resulting from the wider interaction with new viewers of the experiment. The system allows easy viewing and manipulation of both the experiment and the data analysis. Such a usage-centric user experience does not yet exist.

Scientific Collaboration

Teams working on scientific projects are frequently multidisciplinary and international, with team members spread across the globe. One of the strengths of Science Studio is that it allows team members to view the setup and running of an experiment from their home institutions, no matter where they are located. Team members will be designated as either ‘experimenters’ or ‘observers’, depending on their role. ‘Experimenters’ will have permission to run the device remotely online, whereas ‘observers’ will have permission only to observe and to collaborate. Only one experimenter at a time will be allowed to have control of the device; however, control may be handed over to another experimenter on the team. All team members will have the same view of the sample and can test scan areas of interest on the sample, discuss the resulting raw data and determine the final scan area and optimum parameters for analysis.
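The role and control rules described above can be sketched in a few lines of Python. This is only an illustration of the stated rules (two roles, a single controlling experimenter, explicit hand-over); the class name `Session` and all of its methods are hypothetical and do not describe the actual Science Studio software.

```python
class Session:
    """Hypothetical model of one remote experiment session."""

    def __init__(self, experimenters, observers):
        self.experimenters = set(experimenters)
        self.observers = set(observers)
        self.controller = None  # experimenter currently holding control

    def take_control(self, user):
        """Grant control to an experimenter if nobody else holds it."""
        if user not in self.experimenters:
            raise PermissionError(f"{user} is not an experimenter")
        if self.controller is not None and self.controller != user:
            raise RuntimeError(f"{self.controller} already has control")
        self.controller = user

    def hand_over(self, current, new):
        """Pass control from the current holder to another experimenter."""
        if self.controller != current:
            raise PermissionError(f"{current} does not hold control")
        if new not in self.experimenters:
            raise PermissionError(f"{new} is not an experimenter")
        self.controller = new

    def can_operate(self, user):
        """Only the controlling experimenter may drive the device."""
        return self.controller is not None and user == self.controller

    def can_view(self, user):
        """Experimenters and observers all share the same view."""
        return user in self.experimenters or user in self.observers
```

In this sketch an observer can always view but can never take or receive control, and a second experimenter must wait for an explicit hand-over rather than seizing the device.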

This real-time experiment collaboration will allow, for example, industrial partners to observe and contribute their expertise during the experiment process. As well, academic supervisors can mentor and advise their students during the experiment process. Future enhancements to the collaborative aspects of Science Studio will include the ability to link related reports and scientific papers into Science Studio. Team members would be able to log into Science Studio and call up a document related to a project, work on the document as a team, edit and pass the report back and forth until a completed version is agreed upon.

High Speed Data Processing and Storage

ANISE Project: Active Network Interchange for Scientific Experimentation

Synchrotrons are large electron storage rings that produce x-ray beams of unprecedented brightness. These beams can be used to provide critical information on a range of materials whose integrity underlies much of our industrial technology. Synchrotrons have the capacity to produce bundles of individual scientific results within a fraction of a second, yet today most data is not fully assessed for days or months following an experiment. Thus such information is not readily integrated into the logic behind any subsequent experiment, nor is it rapidly available to those experts who might depend on a result in order to make a critical decision. Access to a typical synchrotron-based experiment costs supporting governments over $5000 per hour, and present-day individual experiments may consume many hours. Moreover, access on demand is quite limited because of the high backlog of users. Thus, while there have been massive investments in the physical hardware in support of the advancement of science, the software/hardware solutions for delivering that science to users have not reached their potential.

What is required is a process that would rapidly and completely analyze the output from an experiment and feed back intelligent choices for subsequent experimental steps, based on inputs from the experts and databases as well as on patterns learned from previous experiments. Each individual experimental feedback segment might thus be as brief as a few seconds; this would massively reduce the time spent collecting data that is irrelevant to the problem at hand. We propose that such a capacity could be established on a high-speed network platform called ANISE, accessible by users worldwide and capable of processing the results from more than one synchrotron using stream computing. On completion, ANISE would be capable of providing near-real-time analysis of data from selected synchrotrons based in Canada and the US. Initially, experiments involving x-ray diffraction, tomography and fluorescence would be targeted: all are capable of producing useful experimental bundles of data in sub-second segments, with expected data rates as high as 500 megabytes per second. All of these techniques are of great immediate or potential interest to thousands of academic and industrial researchers worldwide in earth, materials, health and environmental sciences.

Software for such a process could be developed for individual experiments at a single synchrotron; however, prudent resource management suggests that a high-speed network employing stream computing could process data from different types of experiments at several international synchrotrons for a large number of users and potential users. It is proposed that ANISE would serve such a function. Once completed, ANISE would be capable of providing near-real-time analysis of data from synchrotrons based in Canada and the US. Initially, experiments involving x-ray diffraction and fluorescence would be targeted: both are capable of producing useful experimental bundles of data in sub-second segments, with required processing rates as high as 10 megabytes per second for near-real-time operation. Both techniques are of great immediate or potential interest to thousands of researchers worldwide in earth, materials, health and environmental sciences.

Parallel processing is well suited to the type of signal processing that is necessary to filter and shape rapidly arriving signals from the above types of experiments. In the case of a CCD image for a Laue diffraction measurement, it would be desirable to assess the noise character and look for signals that are detectable above that noise before further assessment of the image. The flow chart in the following figure shows an example of how an incoming CCD signal might be treated in steps leading up to the indexing of one or more diffraction spot patterns and the identification of deviations in that pattern that could be identified as mechanical strain. In sequence, incoming signals are enhanced and sorted as belonging to: (a) an undecipherable signal, (b) a signal arising from an amorphous region, (c) a well-ordered pattern, (d) more than one pattern and (e) patterns distorted by the presence of dislocations of varying density and direction. In addition, there needs to be a more fundamental way of recognizing and treating patterns from all crystal systems, not just from the one system targeted in the investigation; this could possibly be accomplished through the use of Hough transforms. The indexing software originally designed and written by Oak Ridge National Laboratory (ORNL) scientists is capable of running the more specialized algorithms needed to index diffraction patterns; what is needed is the introduction of “front end” signal processing routines, many of which are already well researched, that will provide many more categories for the images, each with specialized instructions for subsequent processing. Thus, newly available System S software and cell processing hardware should provide a bridge to unlock the copious information available from experiments at advanced synchrotrons.

Figure 1: Decision Tree for Laue Diffraction
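The first step described above, assessing the noise character of a CCD image and looking for signals detectable above that noise, might be sketched as follows. This is a minimal illustration, not the ANISE or ORNL processing code: the function names are hypothetical, the median and median absolute deviation (MAD) are one common robust choice of noise estimator rather than a method named in the project, and the 5-sigma cut is an assumed default.

```python
from statistics import median


def estimate_noise(pixels):
    """Return (background, sigma) from the median and the scaled MAD."""
    pixels = list(pixels)
    background = median(pixels)
    mad = median(abs(p - background) for p in pixels)
    # 1.4826 converts a MAD into a standard deviation for Gaussian noise.
    return background, 1.4826 * mad


def find_signal_pixels(image, n_sigma=5.0):
    """List (row, col) of pixels more than n_sigma above the background."""
    flat = [p for row in image for p in row]
    background, sigma = estimate_noise(flat)
    threshold = background + n_sigma * sigma
    return [(r, c)
            for r, row in enumerate(image)
            for c, p in enumerate(row)
            if p > threshold]


# Small synthetic frame: flat background with one bright pixel.
image = [
    [10, 11, 9, 10],
    [10, 12, 200, 9],
    [11, 10, 10, 11],
    [9, 10, 11, 10],
]
signal = find_signal_pixels(image)
```

An image that yields no pixels above the threshold would fall into the "undecipherable signal" category, while the later categories would need further steps such as spot grouping and indexing that are outside this sketch.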

ANISE processing of XRF measurements will also be required. An improved search method for detecting trace elements in a solid would be developed using a “grid casting method”: an area grid is searched first for coincidences in the signal from two or more adjacent nodes, and a more detailed search is then made of the regions in which the element has been detected. As well, advanced FFT noise filtering methods would be tested both on the regular Si detector and on the CCD detector. Finally, a capability for XRF and XRF tomography would be introduced to VESPERS later in the project; this would require some changes in stage movement and adaptation of software under development at ORNL.
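The text does not define the “grid casting method” in detail, so the following is only one possible reading of the coincidence step: a coarse grid of counts is scanned, and a node is kept only if it and at least one directly adjacent node both reach a threshold. The function name, the use of 4-connected neighbours and the threshold value are all assumptions made for illustration.

```python
def coincident_nodes(grid, threshold):
    """Return (row, col) nodes whose count reaches the threshold and that
    have at least one 4-connected neighbour also reaching it."""
    rows, cols = len(grid), len(grid[0])
    hits = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] < threshold:
                continue
            neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            if any(0 <= nr < rows and 0 <= nc < cols
                   and grid[nr][nc] >= threshold
                   for nr, nc in neighbours):
                hits.append((r, c))
    return hits


# Coarse survey: counts for one element at each grid node.
survey = [
    [0, 0, 0, 0],
    [0, 5, 6, 0],
    [0, 0, 0, 0],
    [7, 0, 0, 0],
]
candidates = coincident_nodes(survey, threshold=5)
```

Requiring agreement between adjacent nodes rejects isolated spikes such as the lone count in the bottom-left corner of the example, so that the slower detailed scan is spent only on regions where the element was seen more than once.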

Overall, this project would open synchrotrons to become much more interactive with the outside world. The present paradigm where data gathering and analysis are largely isolated functions will be changed to one where most data is analyzed as it is gathered. The results will then be of greater use to the experimenter and to others who have a stake in the experiment. This will increase interest and investment in synchrotrons by a broader range of groups and sectors in our society.