Sequence Alignment Project Bridges Two Disciplines with One Algorithm

For University of Alberta researcher Dr Paul Lu, MACI provides the power to compete in an emerging area of bioinformatics. Lu's research focuses on high performance computing tools and algorithms. Recently, he has turned his attention to biological applications that require vast amounts of computing power and memory. Without access to resources like MACI, Dr Lu would not be on the leading edge of bioinformatics research with his development of DNA and protein sequence alignment algorithms.

At present, Dr Lu is working with Dr Jonathan Schaeffer, Dr Duane Szafron, and Dr David Wishart of the University of Alberta on a collaborative research project that has the potential to dramatically change the everyday approach of biologists. The FastLSA pairwise sequence alignment algorithm was originally developed by Dr Schaeffer and Dr Szafron. Dr Lu and his graduate student have taken this algorithm and improved its performance by parallelizing it to run on MACI's computers. With the performance improvements, a biologist can now more quickly compare newly discovered sequences with thousands of known sequences and gain insight into the structure and function of genes and proteins.
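For readers unfamiliar with pairwise alignment, the sketch below may help make the idea concrete. It is not FastLSA and not the project's code: it is the textbook Needleman-Wunsch dynamic program, reduced to two rows so that it computes only the optimal alignment score in linear space. The function name and the scoring values (match, mismatch, gap) are assumptions chosen for illustration.

```python
def alignment_score(a: str, b: str, match: int = 1,
                    mismatch: int = -1, gap: int = -2) -> int:
    """Optimal global alignment score of a and b (illustrative scoring).

    Only two rows of the dynamic-programming table are kept, so memory
    grows with the shorter sequence rather than with len(a) * len(b).
    """
    if len(b) > len(a):
        a, b = b, a  # scoring is symmetric; keep the shorter sequence as the row
    # Row 0: aligning a prefix of b against nothing costs one gap per character.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(prev[j - 1] + pair,  # align a[i-1] with b[j-1]
                          prev[j] + gap,       # gap in b
                          curr[j - 1] + gap)   # gap in a
        prev = curr
    return prev[len(b)]


if __name__ == "__main__":
    print(alignment_score("ACGT", "AGT"))
```

Recovering the alignment itself, rather than just its score, within limited memory is the harder problem that linear-space algorithms are designed to address.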

The next phase of the project will tackle the multiple sequence alignment problem. Interspecies comparisons of multiple DNA sequences require large computer memories and computational power. Without high performance computational resources, such a comparison could take weeks or months to produce a result, or the biologist would have to settle for a non-optimal answer in less time. Dr Lu's goal is to develop an algorithm that takes multiple sequence alignments two steps further. First, the algorithm must be fast through the efficient use of memory and multiple processors. Second, Dr Lu's algorithm must compute the optimal alignment, instead of an approximation.
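A rough calculation suggests why memory becomes the bottleneck. If the pairwise dynamic program is extended directly to several sequences, the full table has one dimension per sequence. The sketch below counts the cells in such a table; the function name, the sequence length of 1,000, and the figure of 4 bytes per cell are assumptions for illustration, not measurements from the project.

```python
def dp_table_cells(length: int, num_sequences: int) -> int:
    """Cells in a full dynamic-programming table for sequences of equal length."""
    return (length + 1) ** num_sequences


if __name__ == "__main__":
    bytes_per_cell = 4  # assumed size of one stored score
    for k in (2, 3, 4):
        cells = dp_table_cells(1000, k)
        print(f"{k} sequences: {cells:,} cells, "
              f"about {cells * bytes_per_cell / 1e9:,.3f} GB")
```

Under these assumptions, moving from two sequences to three raises the table from about a million cells to about a billion, which is why exact multiple alignment depends on careful use of memory.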

With continued support for, and access to, high performance computational infrastructure, sequence alignment algorithms will take DNA and protein comparisons to a new dimension. Dr Lu hopes that the development of this algorithm will enable a breakthrough in the investigation of unknown DNA sequences and proteins, and that it will in future lay the foundation for public and community access to revolutionary biological tools.

paullu@cs.ualberta.ca

Selected Publications

Kevin Charter, Adrian Driga, Paul Lu, Ian Parsons, Jonathan Schaeffer, and Duane Szafron. FastLSA: A Fast Linear-Space Algorithm for Sequence Alignment. (In preparation)

Adrian Driga, Paul Lu, Jonathan Schaeffer, and Duane Szafron. Parallel Fast Linear Space Alignment. (In preparation)

Ernesto Novillo and Paul Lu. On-Line Debugging and Performance Monitoring with Barriers, 15th International Parallel and Distributed Processing Symposium (IPDPS), San Francisco, California, U.S.A., April 23-27, 2001. To appear.

Paul Lu. Integrating Bulk-Data Transfer into the Aurora Distributed Shared Data System, Journal of Parallel and Distributed Computing, in press for 2001.

Christopher Dutchyn, Paul Lu, Duane Szafron, Steven Bromling, and Wade Holst. Multi-Dispatch in the Java Virtual Machine: Design and Implementation, 6th Conference on Object-Oriented Technologies and Systems (COOTS), San Antonio, Texas, U.S.A., January 29-February 2, 2001.

George Ma and Paul Lu. PBSWeb: A Web-based Interface to the Portable Batch System, 12th IASTED International Conference on Parallel and Distributed Computing and Systems (PDCS), Las Vegas, Nevada, U.S.A., November 6-9, 2000, pp. 24-30.

Christopher Dutchyn, Paul Lu, Duane Szafron, Steve Bromling, and Wade Holst. Multi-Dispatch in the Java Virtual Machine: Design and Implementation, Poster in OOPSLA 2000 Companion, October 15-19, 2000.

Paul Lu. Implementing Scoped Behaviour for Flexible Distributed Data Sharing, IEEE Concurrency, vol. 8, no. 3, pp. 63-73, July-September 2000.

Paul Lu. Scoped Behaviour for Optimized Distributed Data Sharing, Ph.D. thesis, Department of Computer Science, University of Toronto, Toronto, Ontario, Canada, 2000.

Paul Lu. Using Scoped Behaviour to Optimize Data Sharing Idioms, in High Performance Cluster Computing: Programming and Applications, Volume 2, 1/e, Rajkumar Buyya (editor), Prentice Hall PTR, pp. 113-130, 1999. Draft.

Paul Lu. Implementing Optimized Distributed Data Sharing Using Scoped Behaviour and a Class Library, 3rd Conference on Object-Oriented Technologies and Systems (COOTS), Portland, Oregon, U.S.A., June 16-19, 1997, pp. 145-158.

Paul Lu. Aurora: Scoped Behaviour for Per-Context Optimized Distributed Data Sharing, 11th International Parallel Processing Symposium (IPPS), Geneva, Switzerland, April 1-5, 1997, pp. 467-473.

Gregory V. Wilson and Paul Lu, editors. Parallel Programming Using C++, MIT Press, July 1996, 750 pages, appendices, indices. Foreword by Bjarne Stroustrup.

Jonathan Schaeffer, Robert Lake, Paul Lu and Martin Bryant. Chinook: The World Man-Machine Checkers Champion, AI Magazine, vol. 17, no. 1, pp. 21-29, 1996.

Orran Krieger, Benjamin Gamsa, Karen Reid, Paul Lu, Eric Parsons and Michael Stumm. The Importance of Performance-Oriented Flexibility in System Software for Large-Scale Shared-Memory Multiprocessors, OOPSLA'94 Workshop on Flexibility in System Software, October 1994.

Gregory V. Wilson, Brent Gorda and Paul Lu. Twelve Ways to Make Sure Your Parallel Programming System Doesn't Make Others Look Bad, IEEE Computer, vol. 27, no. 10, pp. 112, 1994.

Robert Lake, Jonathan Schaeffer and Paul Lu. Solving Large Retrograde-Analysis Problems Using a Network of Workstations, in Advances in Computer Chess 7, H.J. van den Herik, I.S. Herschberg and J.W.H.M. Uiterwijk (editors), University of Limburg, Maastricht, Netherlands, pp. 135-162, 1994.