Thomas Wallace Colthurst
8 Spencer Ave. #2
Somerville MA 02144
(617)-996-5220
thomaswc@gmail.com
Experience:
- Software Engineer at Google, 2007 - present
I've worked on a variety of projects, including Google Health, the LIFE photo archive, Book Search, Ads Latency, Universal Search, and Image Search. I've also acted as a math / statistics / machine learning consultant to several more. Most recently (2012) I was the tech lead of the team that picked the images to use for the Knowledge Graph.
- Visiting Fellow at the Singularity Institute for Artificial Intelligence, August 2010
- Speech Recognition Researcher at BBN Technologies, 1997 - 2007
Specialized in large vocabulary, conversational telephone speech (CTS).
Research topics included unsupervised training, discriminative training,
word confidence estimation, realtime and sub-realtime decoding,
Arabic and Mandarin speech recognition, system combination, and
pronunciation modelling. Wrote and modified C, C++ and Perl programs
that run on several hundred Linux machines
controlled by Sun Grid Engine. System architect for BBN's submissions to
the 2000 Hub-5 English and Mandarin NIST evaluation, as well as
the 2003 and 2004 EARS Mandarin and Arabic CTS NIST evaluations. (BBN
won all of these evaluations except English in 2000.)
- Head of MIT Experimental Study Group Mathematics Staff, 1994 - 1997
Recruited, supervised and evaluated ten undergraduate and graduate student instructors each semester.
- Programmer for money.com, 1994
Developed a “digital cash” system for internet purchases;
wrote vendor, customer, and server programs in C for
Windows and Linux.
- Programmer for Brown Geology Department, 1990 - 1992
Wrote “Pollen Viewer”, a C program for the visualization
of a 4-d database of fossilized pollen concentrations on an SGI
workstation.
- Programmer for Naval Ocean Systems Center, 1989 - 1990
Wrote a portable scientific graphing library in Fortran.
Also wrote Fortran programs for visualizing sonar response patterns.
Education:
- Ph.D. in Mathematics, MIT, 1992 - 1997
Wrote thesis on “Multidimensional Wavelets”
under Professor Gilbert Strang.
- Sc.B. in Mathematics, Brown University, 1989 - 1992
Papers:
- D. R. H. Miller, et al., "Rapid and accurate spoken term detection", INTERSPEECH 2007, pp. 314-317.
- T. Colthurst, et al., "Parameter tuning for fast speech recognition", INTERSPEECH 2007, pp. 1477-1480.
- S. Matsoukas, et al., "Advances in Transcription of Broadcast News
and Conversational Telephone Speech Within the Combined EARS
BBN/LIMSI System", IEEE Transactions on Audio, Speech and Language
Processing, Vol. 14 Issue 5, Sept. 2006, pp. 1541-1566.
- T. Colthurst and M. Kleber, "A Gray Path on Binary Partitions", 2006.
- R. Prasad, et al., "The 2004 BBN/LIMSI 20xRT English Conversational Telephone Speech Recognition System", INTERSPEECH 2005.
- S. Abdou, et al., "The 2004 BBN Levantine Arabic and Mandarin CTS
Transcription Systems", Rich Transcription 2004.
- R. Schwartz, et al., "Speech Recognition in Multiple Languages and
Domains: The 2003 BBN/LIMSI EARS System", ICASSP 2004.
- T. Colthurst, "Novel Features of DM", 2002.
- S. Matsoukas, et al., "The 2001 Byblos English LVCSR System", ICASSP
2002.
- H. Shu, et al., "The BBN Byblos 2000 Conversational Mandarin LVCSR
System", 2000 Speech Transcription Workshop.
- T. Colthurst, et al., "The 2000 BBN Byblos LVCSR System", ICSLP 2000.
- J. Billa, et al., "Recent Experiments in LVCSR", ICASSP 1999.
- G. Zavaliagkos, et al., "Using Untranscribed Training Data to
Improve Performance", ICSLP 1998.
- T. Colthurst, "Multidimensional Wavelets", 1997.
- T. Colthurst, et al., "Networks Minimizing Length Plus the Number of
Steiner Points", Network Optimization Problems: Algorithms, Complexity,
and Applications, World Scientific 1993, pp. 23-36.
- T. Colthurst, "Fractal Polytopes", 1992.
- A. Knoerr, et al., "Dynamic Visualization of Late Quaternary Pollen Data",
Computing Science and Statistics, Interface '91.