eScience Seminar – Carlos Guestrin – Wednesday, March 13, 2013 EE-303 4:00 PM

*GraphLab: Making Fast Machine Learning on Big Data Accessible to Data
Scientists*

Today, machine learning (ML) methods play a central role in industry and
science. The growth of the Web and improvements in sensor data collection
technology have been rapidly increasing the magnitude and complexity of the
ML tasks we must solve. This growth is driving the need for scalable,
parallel ML algorithms that can handle "Big Data."

Unfortunately, implementing efficient parallel ML algorithms is
challenging, and it keeps data scientists from spending time on their most
important goal: scientific discovery.

In this talk, I will describe the GraphLab framework, which naturally
expresses asynchronous, dynamic graph computations that are key for
state-of-the-art ML algorithms. When these algorithms are expressed in our
higher-level abstraction, GraphLab will effectively address many of the
underlying parallelism challenges, including data
distribution, optimized communication, and guaranteeing sequential
consistency, a property that is surprisingly important for many ML
algorithms. On a variety of large-scale tasks, GraphLab provides 20-100x
performance improvements over Hadoop. In recent months, GraphLab has
received tens of thousands of downloads, and is being actively used by a
number of startups, companies, research labs and universities.

This talk represents joint work with Yucheng Low, Joey Gonzalez, Aapo
Kyrola, Jay Gu, Danny Bickson, and Joseph Bradley.

*Carlos Guestrin (UW CSE)*

Carlos Guestrin is the Amazon Professor of Machine Learning at the Computer
Science & Engineering Department of the University of Washington. His
previous positions include Associate Professor at Carnegie Mellon
University, senior researcher at the Intel Research Lab in Berkeley, and
founder of a startup on user modeling and recommendations. Carlos received
his PhD and Master's degrees from Stanford University, and a Mechatronics
Engineer degree from the University of Sao Paulo, Brazil. Carlos’ work has been
recognized by awards at a number of conferences and two journals: KDD 2007
and 2010, IPSN 2005 and 2006, VLDB 2004, NIPS 2003 and 2007, UAI 2005, ICML
2005, AISTATS 2010, JAIR in 2007 & 2012, and JWRPM in 2009. He is also a
recipient of the ONR Young Investigator Award, NSF CAREER Award, Alfred P.
Sloan Fellowship, IBM Faculty Fellowship, the Siebel Scholarship and the
Stanford Centennial Teaching Assistant Award. Carlos was named one of the
2008 ‘Brilliant 10’ by Popular Science Magazine, received the IJCAI
Computers and Thought Award and the Presidential Early Career Award for
Scientists and Engineers (PECASE). He is a former member of the Information
Sciences and Technology (ISAT) advisory group for DARPA.

*Upcoming Seminars:*

* April 11, 4 PM (EE303)

*Barry Wark* (Physion Consulting)

TBD

* May 1, 4 PM (EE303)

*Jeff Gardner* (UW)

Simulating the Universe on Google’s Exacycle Platform

* May 13, 4 PM (EE303)

*Fernando Perez* (Berkeley)

TBD
