April 2, 2004  
 

 

Networking: The Next Generation

By Kate Metropolis

Rapid, reliable data transport is essential for global scientific collaborations to gain new knowledge. SLAC was recently honored for its contributions to high-speed networked computing.

Congratulations from U.S. Congresswoman Anna Eshoo. (Image Courtesy of Les Cottrell)

Data are nature’s gift to scientists, but the gift can at times feel like a curse. To address some key questions in their fields, researchers at the frontiers of particle physics, astronomy, bioinformatics, global climate modeling and seismology need to harvest, store, share and analyze staggering quantities of information.

Today, the BABAR collaboration ships about a terabyte of data every 24 hours between SLAC and computing centers in France, Italy, and England, the equivalent of a thousand copies of the Encyclopedia Britannica a day. To keep making scientific breakthroughs, the collaboration needs its dataset to roughly double every year.
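
To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python; the decimal convention (1 terabyte = 10^12 bytes) and a transfer spread evenly over 24 hours are assumptions for illustration, not details from the collaboration:

    # Average sustained rate needed to ship about one terabyte per day.
    # Assumptions: decimal units (1 TB = 10**12 bytes), traffic spread evenly over the day.
    daily_bytes = 10**12
    seconds_per_day = 24 * 3600

    avg_rate_bps = daily_bytes * 8 / seconds_per_day
    print(f"Average sustained rate: {avg_rate_bps / 1e6:.0f} Mbit/s")   # roughly 93 Mbit/s

    # The Encyclopedia Britannica comparison implies roughly a gigabyte of text per copy:
    print(f"Bytes per copy: {daily_bytes / 1000:.0e}")                  # 1e+09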

By the year 2010, when a new generation of experiments will be underway, the high energy physics community anticipates that datasets will reach 100 petabytes, the digital equivalent of 20,000 times all the text in the Library of Congress. Thousands of physicists and students at institutions around the world will want to access and process them using petaflops of distributed computing. (A petabyte is 1 billion megabytes; a petaflop is 1 million billion computations per second.)
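
For readers keeping track of the prefixes, the conversions quoted above work out as follows (a quick sketch assuming decimal SI units):

    # Unit arithmetic behind the projections above (decimal SI prefixes assumed).
    MB = 10**6                  # bytes in a megabyte
    PB = 10**15                 # bytes in a petabyte

    print(PB // MB)             # 1000000000: a petabyte is indeed 1 billion megabytes
    print(100 * PB)             # 10**17 bytes in the anticipated 100-petabyte datasets

    petaflop = 10**15           # floating-point operations per second
    print(f"{petaflop:,}")      # 1,000,000,000,000,000: a million billion per second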

The computing tools to provide these capabilities are being developed by the leaders in high-performance computing at SLAC and their colleagues around the world.

Teams of physicists from Caltech, SLAC, LANL, CERN, Fermilab, Florida International University, University of Florida, University of Michigan, BNL, and the MIT Haystack Observatory are focused on optimizing the use of the Grid for data-intensive science. The Information Grid is analogous to the electric power grid. Just as you toast your bagel in blissful ignorance of which power generators your toaster is drawing electricity from, you’ll analyze data using a geographically distributed network of computational resources, without ever knowing where they are.

The collaboration, which also includes experts from U.S. industry, is building a permanent facility for testing, tuning and using applications from physics and astronomy that require reliable data transfers at rates of up to 10 gigabits per second.

The partnership is named UltraLight because data are transmitted as pulses of light. A single optical fiber can carry information at 10 gigabits per second on a single wavelength of light. By using many different wavelengths at once, the same fiber can carry more than 100 such signals simultaneously.
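
As a rough illustration of what that multiplexing adds up to, here is a short sketch; it assumes exactly 10 gigabits per second per wavelength and 100 wavelengths, the round numbers quoted above:

    # Aggregate capacity of a single fiber with wavelength-division multiplexing,
    # using the article's round numbers: 10 Gbit/s per wavelength, 100 wavelengths.
    per_wavelength_gbps = 10
    wavelengths = 100

    aggregate_gbps = per_wavelength_gbps * wavelengths
    print(f"One fiber: {aggregate_gbps} Gbit/s, i.e. {aggregate_gbps / 1000} Tbit/s")   # 1.0 Tbit/s

    # At that aggregate rate, BABAR's daily terabyte would cross the fiber in seconds:
    terabyte_bits = 10**12 * 8
    print(f"{terabyte_bits / (aggregate_gbps * 1e9):.0f} s per terabyte")               # 8 s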

Networking knowledge, experience and innovations from the UltraLight partnership will also be invaluable for developing the statewide gigabit network being built to serve millions of Californians. These contributions were recognized last month by CENIC, a not-for-profit corporation dedicated to facilitating advanced network services for research and education in California. UltraLight won CENIC’s second annual ‘On the Road to a Gigabit’ Award, announced in March, for “the best use of high-performance networking developed by a private/public partnership.”

UltraLight builds on an earlier triumph. At the Supercomputing 2003 conference last November, a team from SLAC, Caltech, LANL, CERN, Manchester and Amsterdam, with assistance from private industry, set a world record for the fastest data transmission: 6.6 terabytes in 48 minutes, a rate of 23.2 gigabits per second (see TIP, Dec. 12, 2003). That is the equivalent of about 2,000 feature-length DVD-quality movies. The goal was to demonstrate what can be achieved with technologies that are readily available today.
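
For scale, a quick check of what those numbers imply; decimal units and roughly 3.3 gigabytes per DVD-quality movie are assumptions for illustration, not figures from the record itself:

    # Putting the Supercomputing 2003 figures in everyday terms.
    # Assumptions: decimal units; ~3.3 GB for a feature-length DVD-quality movie.
    volume_bytes = 6.6e12                  # total data moved during the run
    movie_bytes = 3.3e9

    print(f"{volume_bytes / movie_bytes:.0f} movie equivalents")       # about 2000

    # What a rate of 23.2 gigabits per second means in bytes:
    rate_bytes_per_s = 23.2e9 / 8
    print(f"{rate_bytes_per_s / 1e9:.1f} GB every second")             # about 2.9 GB/s, close to
    # one DVD-quality movie per second at the assumed movie size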

“We wanted to open people’s eyes,” said SCS Assistant Director Les Cottrell, “and get them thinking, ‘What would I do if I had this [capability]?’”

“The way you move data shapes the design of experiments,” said Cottrell. “Until the late 1990s, most high energy physics data were transferred on tapes. At SLAC alone, producing, packaging and shipping the tapes took the full-time effort of two people.” How long did it take for data from SLAC to reach a researcher at CERN? Two weeks.

In a broader context, Cottrell believes the team’s results will stimulate scientists and engineers in other areas to invent new models for collaboration in research and business. People around the world will be able to share their computing resources and information with unprecedented ease and efficiency.

 

The Stanford Linear Accelerator Center is managed by Stanford University for the U.S. Department of Energy
