Twitter Search + big Hadoop, Dec. 8th at Seattle Scalability Meetup

2010-11-30 Thread Bradford Stephens
Greetings,

The Seattle Scalability Meetup isn't slacking for the holidays. We've
got an awesome lineup for Wed, December 8 at 7pm:

http://www.meetup.com/Seattle-Hadoop-HBase-NoSQL-Meetup/

-Jake Mannix from Twitter will talk about the Twitter Search
infrastructure (with distributed Lucene)
-Chris Nauroth from a Really Large Media Company will talk about how
they use Hadoop
-We may also have someone from Elastic MapReduce talk about their
really cool new stuff.

The meetup is at Amazon South Lake Union campus, and usually goes from
7pm-8:30pm. Drinks are at Fierabend afterward, at 422 Yale Ave N.

Address and more information here:
http://www.meetup.com/Seattle-Hadoop-HBase-NoSQL-Meetup/

Look forward to seeing you all!

Cheers,
B

-- 
Bradford Stephens,
Founder, Drawn to Scale
drawntoscalehq.com
727.697.7528

http://www.drawntoscalehq.com --  The intuitive, cloud-scale data
solution. Process, store, query, search, and serve all your data.

http://www.roadtofailure.com -- The Fringes of Scalability, Social
Media, and Computer Science


Seattle Scalability Meetup: Rackspace OpenStack, Karmasphere Hadoop, Wed Oct 27

2010-10-25 Thread Bradford Stephens
Link/Details:
http://www.meetup.com/Seattle-Hadoop-HBase-NoSQL-Meetup/calendar/13704371/

This meetup focuses on Scalability and technologies to enable handling
large amounts of data: Hadoop, HBase, distributed NoSQL databases, and
more! There's not only a focus on technology, but also everything
surrounding it including operations, management, business use cases,
etc. We've had great success in the past, with guests from LinkedIn,
Amazon, Twitter, Facebook, Cloudant, and 10gen/MongoDB, and are growing
quickly!

This month's guests:
Mike Mayo, Rackspace: Learn details on Rackspace's new Open Cloud
offering -- a complete scalable cloud stack, but open source!
Abe Taha, VP Engineering, Karmasphere: Karmasphere produces a Hadoop
development environment. Learn more about working with Hadoop
effectively, and see their exciting new offerings.

Location:
Amazon HQ, Von Vorst Building, 426 Terry Ave N., Seattle, WA 98109-5210

Afterparty:
Fierabend, 422 Yale Ave N




Seattle Hadoop/NoSQL: Facebook, more Discussion. Thurs May 27th

2010-05-13 Thread Bradford Stephens
We've heard your feedback from the last meetup: we're having fewer
speakers and more discussion. Yay!
http://www.meetup.com/Seattle-Hadoop-HBase-NoSQL-Meetup/

We're expecting:

1. Facebook will talk about Hive (a SQL-like language for MapReduce)
2. OpsCode will talk about cluster management with Chef
3. Then we'll break up into groups and have casual Hadoop/NoSQL
related discussions and Q&A with several experts, so you can learn
more!

Also, stay tuned for news on a FREE Seattle Hadoop Community &
Training day in late July. We're going to get some fantastic people,
and you'll have hands-on experience with all the Hadoop ecosystem.

When: Thursday, May 27, 2010 6:45 PM

Where:
Amazon SLU, Von Vorst Building
426 Terry Ave N
Seattle, WA 98109
904-415-3009



April Seattle Hadoop/Scalability/NoSQL Meetup: Cassandra, Science, More!

2010-04-21 Thread Bradford Stephens
Hey there! Wanted to let you all know about our next meetup, April
28th. We've got a killer new venue thanks to Amazon.

Check out the details at the link:
http://www.meetup.com/Seattle-Hadoop-HBase-NoSQL-Meetup/calendar/13072272/

Our Speakers this month:
1. Nick Dimiduk, Drawn to Scale: Intro to Hadoop, HBase, and NoSQL
2. Benjamin Black: Intro to Cassandra
3. Adam Jacob, CTO, OpsCode: Chef and Cluster Management
4. Sarah Killcoyne, Systems Biology: Big Data in Science

We've had great success in the past, with guests from LinkedIn,
Amazon, Cloudant, 10gen/MongoDB, and more, and are growing quickly!

Our format is flexible: We usually have speakers who talk for ~20
minutes each and then do Q+A, plus lightning talks, discussion, and
then social time.

There'll be beer afterwards, of course! Fierabend, 422 Yale Ave N

Meetup Location:
Amazon HQ, Von Vorst Building, 426 Terry Ave N., Seattle, WA 98109-5210

Hope to see you there! And we're always open to suggestions.



Re: Experience with indexing billions of documents?

2010-04-13 Thread Bradford Stephens
Hey there,

We've actually been tackling this problem at Drawn to Scale. We'd really
like to get our hands on LuceHBase to see how it scales. Our faceting still
needs to be done in-memory, which is kinda tricky, but it's worth
exploring.

On Mon, Apr 12, 2010 at 7:27 AM, Thomas Koch wrote:

> Hi,
>
> could I interest you in this project?
> http://github.com/thkoch2001/lucehbase
>
> The aim is to store the index directly in HBase, a database system modelled
> after google's Bigtable to store data in the regions of tera or petabytes.
>
> Best regards, Thomas Koch
>
> Lance Norskog:
> > The 2B limitation is within one shard, due to using a signed 32-bit
> > integer. There is no limit in that regard in sharding- Distributed
> > Search uses the stored unique document id rather than the internal
> > docid.
> >
> > On Fri, Apr 2, 2010 at 10:31 AM, Rich Cariens wrote:
> > > A colleague of mine is using native Lucene + some home-grown
> > > patches/optimizations to index over 13B small documents in a 32-shard
> > > environment, which is around 406M docs per shard.
> > >
> > > If there's a 2B doc id limitation in Lucene then I assume he's patched
> > > it himself.
> > >
> > > On Fri, Apr 2, 2010 at 1:17 PM,  wrote:
> > >> My guess is that you will need to take advantage of Solr 1.5's
> > >> upcoming cloud/cluster renovations and use multiple indexes to
> > >> comfortably achieve those numbers. Hypothetically, in that case, you
> > >> won't be limited by single index docid limitations of Lucene.
> > >>
> > >> > We are currently indexing 5 million books in Solr, scaling up over
> > >> > the next few years to 20 million.  However we are using the entire
> > >> > book as a Solr document.  We are evaluating the possibility of
> > >> > indexing individual pages as there are some use cases where users
> > >> > want the most relevant pages regardless of what book they occur in.
> > >> > However, we estimate that we are talking about somewhere between 1
> > >> > and 6 billion pages and have concerns over whether Solr will scale
> > >> > to this level.
> > >> >
> > >> > Does anyone have experience using Solr with 1-6 billion Solr
> > >> > documents?
> > >> >
> > >> > The lucene file format document
> > >> > (http://lucene.apache.org/java/3_0_1/fileformats.html#Limitations)
> > >> > mentions a limit of about 2 billion document ids.   I assume this is
> > >> > the lucene internal document id and would therefore be a per
> > >> > index/per shard limit.  Is this correct?
> > >> >
> > >> >
> > >> > Tom Burton-West.
> >
>
> Thomas Koch, http://www.koch.ro
>
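
As a rough sanity check on the numbers quoted above, here is a minimal
Python sketch. It is not from the thread: the function names are made up
for illustration, and the ceiling of 2^31 - 1 is simply the largest signed
32-bit integer, taken as an assumption from the "signed 32-bit integer"
remark (the practical per-index limit may be somewhat lower).

```python
# Assumption: the per-shard ceiling is the largest signed 32-bit integer,
# as suggested by the "signed 32-bit integer" remark in the thread.
MAX_DOCS_PER_SHARD = 2**31 - 1  # 2,147,483,647


def docs_per_shard(total_docs, shards):
    """Documents each shard holds if they are spread evenly (rounded up)."""
    return -(-total_docs // shards)  # integer ceiling division


def min_shards(total_docs, max_per_shard=MAX_DOCS_PER_SHARD):
    """Smallest shard count that keeps every shard under the docid ceiling."""
    return -(-total_docs // max_per_shard)


if __name__ == "__main__":
    # 13B documents over 32 shards, the setup Rich Cariens describes.
    print(docs_per_shard(13_000_000_000, 32))
    # Shards needed for the upper estimate of 6 billion pages.
    print(min_shards(6_000_000_000))
```

This only covers the docid ceiling. How many shards are comfortable in
practice depends on memory, query load, and faceting, none of which the
sketch tries to model.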





Seattle Hadoop/Scalability/NoSQL Meetup Wednesday, March 31st. w/ LinkedIn's Jake Mannix

2010-03-24 Thread Bradford Stephens
Greetings,

Don't forget that the Hadoop/Scalability/NoSQL meetup is next
Wednesday, March 31st at 6:45pm! We're going to have a very exciting
guest: Jake Mannix from LinkedIn will talk about machine learning on
Hadoop. He's a well-decorated engineer across many disciplines, and
even knows quite a bit about distributed search with Lucene.

We may also hear from Sarah Killcoyne from Systems Biology. She'll be
talking about Big Data in the biology / research fields.

Check out the details here:
http://www.meetup.com/Seattle-Hadoop-HBase-NoSQL-Meetup/

Hope to see you there!

Cheers,
Bradford

-- 
http://www.drawntoscalehq.com --  The intuitive, cloud-scale data
solution. Process, store, query, search, and serve all your data.

http://www.roadtofailure.com -- The Fringes of Scalability, Social
Media, and Computer Science


Re: Seattle Hadoop/Scalability/NoSQL Meetup Tonight!

2010-02-25 Thread Bradford Stephens
Thanks for coming, everyone! We had around 25 people. A *huge*
success, for Seattle. And a big thanks to 10gen for sending Richard.

Can't wait to see you all next month.

On Wed, Feb 24, 2010 at 2:15 PM, Bradford Stephens
 wrote:
> The Seattle Hadoop/Scalability/NoSQL (yeah, we vary the title) meetup
> is tonight! We're going to have a guest speaker from MongoDB :)
>
> As always, it's at the University of Washington, Allen Computer
> Science building, Room 303 at 6:45pm. You can find a map here:
> http://www.washington.edu/home/maps/southcentral.html?cse
>
> If you can, please RSVP here (not required, but very nice):
> http://www.meetup.com/Seattle-Hadoop-HBase-NoSQL-Meetup/
>
> --
> http://www.drawntoscalehq.com --  The intuitive, cloud-scale data
> solution. Process, store, query, search, and serve all your data.
>
> http://www.roadtofailure.com -- The Fringes of Scalability, Social
> Media, and Computer Science
>





Seattle Hadoop/Scalability/NoSQL Meetup Tonight!

2010-02-24 Thread Bradford Stephens
The Seattle Hadoop/Scalability/NoSQL (yeah, we vary the title) meetup
is tonight! We're going to have a guest speaker from MongoDB :)

As always, it's at the University of Washington, Allen Computer
Science building, Room 303 at 6:45pm. You can find a map here:
http://www.washington.edu/home/maps/southcentral.html?cse

If you can, please RSVP here (not required, but very nice):
http://www.meetup.com/Seattle-Hadoop-HBase-NoSQL-Meetup/



Seattle Hadoop/Lucene/NoSQL Meetup; Wed Feb 24th, Feat. MongoDB

2010-02-16 Thread Bradford Stephens
Greetings,

It's time for another awesome Seattle Hadoop/Lucene/Scalability/NoSQL Meetup!

As always, it's at the University of Washington, Allen Computer
Science building, Room 303 at 6:45pm. You can find a map here:
http://www.washington.edu/home/maps/southcentral.html?cse

Last month, we had a great talk from Steve McPherson of Razorfish on
their usage of Hadoop. This month, we'll have Richard Kreuter from
MongoDB talking about, well, MongoDB. As well as assorted discussion
on the Hadoop ecosystem.

If you can, please RSVP here (not required, but very nice):
http://www.meetup.com/Seattle-Hadoop-HBase-NoSQL-Meetup/

My cell # is 904-415-3009 if you have questions/get lost.

Cheers,
Bradford

-- 
http://www.drawntoscalehq.com -- Big Data for all. The Big Data Platform.

http://www.roadtofailure.com -- The Fringes of Scalability, Social
Media, and Computer Science


Reminder: Seattle Hadoop / HBase / Lucene / NoSQL meetup Jan 27th! Feat. Razorfish

2010-01-25 Thread Bradford Stephens
Greetings,

I'm in the Bay Area doing startup-stuff this week, so Nick Dimiduk
will be running this meetup again. You can reach him at
ndimi...@gmail.com and 614-657-0267

A friendly reminder that the Seattle Hadoop, NoSQL, etc. meetup is on
January 27th at University of Washington in the Allen Computer Science
Building, room 303.

I believe Razorfish will be giving a talk on how they use Hadoop.

Here's the new, shiny meetup.com link with more detail:
http://www.meetup.com/Seattle-Hadoop-HBase-NoSQL-Meetup



Seattle Hadoop / HBase / Lucene / NoSQL meetup Jan 27th!

2010-01-11 Thread Bradford Stephens
Greetings,

A friendly reminder that the Seattle Hadoop, NoSQL, etc. meetup is on
January 27th at University of Washington in the Allen Computer Science
Building, room 303.

I believe Razorfish will be giving a talk on how they use Hadoop.

Here's the new, shiny meetup.com link with more detail:
http://www.meetup.com/Seattle-Hadoop-HBase-NoSQL-Meetup



Re: Seattle / NW Hadoop, Lucene, Apache "Cloud Stack" Meetup, Wed Oct 28 6:45pm

2009-10-27 Thread Bradford Stephens
Hey guys! Don't forget this is tomorrow (Wednesday). See you there!

Cheers,
Bradford

On Sun, Oct 18, 2009 at 5:10 PM, Bradford Stephens
 wrote:
> Greetings,
>
> (You're receiving this e-mail because you're on a DL or I think you'd
> be interested)
>
> It's time for another Hadoop/Lucene/Apache "Cloud" stack meetup! This
> month it'll be on Wednesday, the 28th, at 6:45 pm.
>
> A *huge* thanks to everyone who showed up last month, and to Facebook
> for sending someone awesome to speak about Hive. We learned quite a
> bit!
>
> For October, we will have someone speaking about Cascading, and how it
> helps workflow abstraction with MapReduce. Very useful stuff to know.
>
> We've had great attendance in the past few months, let's keep it up!
> I'm always amazed by the things I learn from everyone.
>
> We're at the University of Washington, Allen Computer Science Center
> (not Electrical Engineering)
>
> Map: http://www.washington.edu/home/maps/?CSE
>
> Room: 303 -or- the Entry level. If there are changes, signs will be posted.
>
> More Info:
>
> The meetup is about 2 hours (and there's usually food): we'll have two
> in-depth talks, and then several "lightning talks" of 5 minutes. We'll
> then have discussion and 'social time'. Let me know if you're
> interested in speaking or attending. We'd like to focus on education,
> so feel free to ask questions.
>
> Contact: Bradford Stephens, 904-415-3009, bradfordsteph...@gmail.com
>
> --
> http://www.drawntoscaleconsulting.com - Scalability, Hadoop, HBase,
> and Distributed Lucene Consulting
>
> http://www.roadtofailure.com -- The Fringes of Scalability, Social
> Media, and Computer Science
>



-- 
http://www.drawntoscaleconsulting.com - Scalability, Hadoop, HBase,
and Distributed Lucene Consulting

http://www.roadtofailure.com -- The Fringes of Scalability, Social
Media, and Computer Science


Seattle / NW Hadoop, Lucene, Apache "Cloud Stack" Meetup, Wed Oct 28 6:45pm

2009-10-18 Thread Bradford Stephens
Greetings,

(You're receiving this e-mail because you're on a DL or I think you'd
be interested)

It's time for another Hadoop/Lucene/Apache "Cloud" stack meetup! This
month it'll be on Wednesday, the 28th, at 6:45 pm.

A *huge* thanks to everyone who showed up last month, and to Facebook
for sending someone awesome to speak about Hive. We learned quite a
bit!

For October, we will have someone speaking about Cascading, and how it
helps workflow abstraction with MapReduce. Very useful stuff to know.

We've had great attendance in the past few months, let's keep it up!
I'm always amazed by the things I learn from everyone.

We're at the University of Washington, Allen Computer Science Center
(not Electrical Engineering)

Map: http://www.washington.edu/home/maps/?CSE

Room: 303 -or- the Entry level. If there are changes, signs will be posted.

More Info:

The meetup is about 2 hours (and there's usually food): we'll have two
in-depth talks, and then several "lightning talks" of 5 minutes. We'll
then have discussion and 'social time'. Let me know if you're
interested in speaking or attending. We'd like to focus on education,
so feel free to ask questions.

Contact: Bradford Stephens, 904-415-3009, bradfordsteph...@gmail.com



Re: Seattle / PNW Hadoop/Lucene/HBase Meetup, Wed Sep 30th

2009-09-28 Thread Bradford Stephens
Hello everyone!
Don't forget that the Meetup is THIS Wednesday! I'm looking forward to
hearing about Hive from the Facebook team ... and there might be a few other
interesting talks as well. Here's the details in the wiki:
http://wiki.apache.org/hadoop/PNW_Hadoop_%2B_Apache_Cloud_Stack_User_Group

Cheers,
Bradford

On Mon, Sep 14, 2009 at 11:35 AM, Bradford Stephens <
bradfordsteph...@gmail.com> wrote:

> Greetings,
>
> It's time for another Hadoop/Lucene/Apache "Cloud" Stack meetup!
> This month it'll be on Wednesday, the 30th, at 6:45 pm.
>
> We should have a few interesting guests this time around -- someone from
> Facebook may be stopping by to talk about Hive :)
>
> We've had great attendance in the past few months, let's keep it up! I'm
> always amazed by the things I learn from everyone.
>
> We're back at the University of Washington, Allen Computer Science
> Center (not Computer Engineering)
> Map: http://www.washington.edu/home/maps/?CSE
>
> Room: 303 -or- the Entry level. If there are changes, signs will be posted.
>
> More Info:
>
> The meetup is about 2 hours (and there's usually food): we'll have two
> in-depth talks of 15-20 minutes each, and then several "lightning
> talks" of 5 minutes. If no one offers a talk, we'll just have general
> discussion. We'll then have 'social time'. Let me know if you're
> interested in speaking or attending. We'd like to focus on education,
> so every presentation *needs* to ask some questions at the end. We can
> talk about these after the presentations, and I'll record what we've
> learned in a wiki and share that with the rest of us.
>
> Contact: Bradford Stephens, 904-415-3009, bradfordsteph...@gmail.com
>
> Cheers,
> Bradford
> --
> http://www.roadtofailure.com -- The Fringes of Scalability, Social
> Media, and Computer Science
>





Re: Seattle / PNW Hadoop/Lucene/HBase Meetup, Wed Sep 30th

2009-09-24 Thread Bradford Stephens
Friendly Reminder! One week to go.

On Mon, Sep 14, 2009 at 11:35 AM, Bradford Stephens <
bradfordsteph...@gmail.com> wrote:

> Greetings,
>
> It's time for another Hadoop/Lucene/Apache "Cloud" Stack meetup!
> This month it'll be on Wednesday, the 30th, at 6:45 pm.
>
> We should have a few interesting guests this time around -- someone from
> Facebook may be stopping by to talk about Hive :)
>
> We've had great attendance in the past few months, let's keep it up! I'm
> always amazed by the things I learn from everyone.
>
> We're back at the University of Washington, Allen Computer Science
> Center (not Computer Engineering)
> Map: http://www.washington.edu/home/maps/?CSE
>
> Room: 303 -or- the Entry level. If there are changes, signs will be posted.
>
> More Info:
>
> The meetup is about 2 hours (and there's usually food): we'll have two
> in-depth talks of 15-20 minutes each, and then several "lightning
> talks" of 5 minutes. If no one offers a talk, we'll just have general
> discussion. We'll then have 'social time'. Let me know if you're
> interested in speaking or attending. We'd like to focus on education,
> so every presentation *needs* to ask some questions at the end. We can
> talk about these after the presentations, and I'll record what we've
> learned in a wiki and share that with the rest of us.
>
> Contact: Bradford Stephens, 904-415-3009, bradfordsteph...@gmail.com
>
> Cheers,
> Bradford
> --
> http://www.roadtofailure.com -- The Fringes of Scalability, Social
> Media, and Computer Science
>





Seattle / PNW Hadoop/Lucene/HBase Meetup, Wed Sep 30th

2009-09-14 Thread Bradford Stephens
Greetings,

It's time for another Hadoop/Lucene/Apache "Cloud" Stack meetup!
This month it'll be on Wednesday, the 30th, at 6:45 pm.

We should have a few interesting guests this time around -- someone from
Facebook may be stopping by to talk about Hive :)

We've had great attendance in the past few months, let's keep it up! I'm always
amazed by the things I learn from everyone.

We're back at the University of Washington, Allen Computer Science
Center (not Computer Engineering)
Map: http://www.washington.edu/home/maps/?CSE

Room: 303 -or- the Entry level. If there are changes, signs will be posted.

More Info:

The meetup is about 2 hours (and there's usually food): we'll have two
in-depth talks of 15-20 minutes each, and then several "lightning
talks" of 5 minutes. If no one offers a talk, we'll just have general
discussion. We'll then have 'social time'. Let me know if you're
interested in speaking or attending. We'd like to focus on education,
so every presentation *needs* to ask some questions at the end. We can
talk about these after the presentations, and I'll record what we've
learned in a wiki and share that with the rest of us.

Contact: Bradford Stephens, 904-415-3009, bradfordsteph...@gmail.com

Cheers,
Bradford


Re: Seattle / NW Hadoop, HBase Lucene, etc. Meetup , Wed August 26th, 6:45pm

2009-08-26 Thread Bradford Stephens

Hello,

My apologies, but there was a mix-up reserving our meeting location,  
and we don't have access to it.


I'm very sorry, and beer is on me next month. Promise :)

Sent from my Internets

On Aug 25, 2009, at 4:21 PM, Bradford Stephens wrote:



Hey there,

Apologies for this not going out sooner -- apparently it was sitting
as a draft in my inbox. A few of you have pinged me, so thanks for
your vigilance.

It's time for another Hadoop/Lucene/Apache Stack meetup! We've had
great attendance in the past few months, let's keep it up! I'm always
amazed by the things I learn from everyone.

We're back at the University of Washington, Allen Computer Science
Center (not Computer Engineering)
Map: http://www.washington.edu/home/maps/?CSE

Room: 303 -or- the Entry level. If there are changes, signs will be posted.


More Info:

The meetup is about 2 hours: we'll have two in-depth talks of 15-20
minutes each, and then several "lightning talks" of 5 minutes. If no
one offers a talk, we'll just have general discussion. We'll then have
'social time'. Let me know if you're interested in speaking or
attending. We'd like to focus on education, so every presentation
*needs* to ask some questions at the end. We can talk about these
after the presentations, and I'll record what we've learned in a wiki
and share that with the rest of us.

Contact: Bradford Stephens, 904-415-3009, bradfordsteph...@gmail.com



Seattle / NW Hadoop, HBase Lucene, etc. Meetup , Wed August 26th, 6:45pm

2009-08-25 Thread Bradford Stephens
Hey there,

Apologies for this not going out sooner -- apparently it was sitting
as a draft in my inbox. A few of you have pinged me, so thanks for
your vigilance.

It's time for another Hadoop/Lucene/Apache Stack meetup! We've had
great attendance in the past few months, let's keep it up! I'm always
amazed by the things I learn from everyone.

We're back at the University of Washington, Allen Computer Science
Center (not Computer Engineering)
Map: http://www.washington.edu/home/maps/?CSE

Room: 303 -or- the Entry level. If there are changes, signs will be posted.

More Info:

The meetup is about 2 hours: we'll have two in-depth talks of 15-20
minutes each, and then several "lightning talks" of 5 minutes. If no
one offers a talk, we'll just have general discussion. We'll then have
'social time'. Let me know if you're interested in speaking or
attending. We'd like to focus on education, so every presentation
*needs* to ask some questions at the end. We can talk about these
after the presentations, and I'll record what we've learned in a wiki
and share that with the rest of us.

Contact: Bradford Stephens, 904-415-3009, bradfordsteph...@gmail.com



Language Detection for Analysis?

2009-08-06 Thread Bradford Stephens
Hey there,

We're trying to add foreign language support into our new search
engine -- languages like Arabic, Farsi, and Urdu (that don't work with
standard analyzers). But our data source doesn't tell us which
languages we're actually collecting -- we just get blocks of text. Has
anyone here worked on language detection so we can figure out what
analyzers to use? Are there commercial solutions?

Much appreciated!
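
To make the question concrete, here is a minimal sketch of one possible
first-pass heuristic. It is an assumption for illustration, not an existing
library or anything we have in production. Arabic, Farsi, and Urdu all use
the Arabic script, so the sketch first checks whether most letters fall in
the Unicode Arabic block (U+0600 to U+06FF) and then looks for letters that
are typical of Urdu or Farsi. The function name, letter sets, and 50%
threshold are all hypothetical choices, and short or mixed texts will be
misclassified.

```python
# Hypothetical heuristic: guess a language code from the characters used.
ARABIC_BLOCK = range(0x0600, 0x0700)

# Letters used in Urdu but not in standard Arabic or Farsi
# (tte, ddal, rre, noon ghunna, yeh barree).
URDU_ONLY = set("\u0679\u0688\u0691\u06ba\u06d2")

# Letters used in Farsi (and Urdu) but not in standard Arabic
# (peh, tcheh, jeh, gaf).
PERSIAN_HINT = set("\u067e\u0686\u0698\u06af")


def guess_language(text):
    """Return 'ar', 'fa', 'ur', 'other', or 'unknown' for a block of text."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return "unknown"
    arabic = [ch for ch in letters if ord(ch) in ARABIC_BLOCK]
    # Assumed threshold: at least half the letters must be Arabic script.
    if len(arabic) / len(letters) < 0.5:
        return "other"
    chars = set(arabic)
    if chars & URDU_ONLY:
        return "ur"  # checked first, since Urdu also uses the Farsi letters
    if chars & PERSIAN_HINT:
        return "fa"
    return "ar"
```

The returned code could then be used to pick an analyzer per document. A
statistical approach (for example, character n-gram profiles) would likely
be more robust than letter spotting, especially for short texts.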



Re: THIS WEEK: PNW Hadoop, HBase / Apache Cloud Stack Users' Meeting, Wed Jul 29th, Seattle

2009-08-05 Thread Bradford Stephens
A big "thanks" to everyone who came out despite the heat! Hope to see
you again the last week of August, probably at UW.

On Wed, Jul 29, 2009 at 4:52 PM, Bradford
Stephens wrote:
> Don't forget this is tonight! Excited to see everyone there.
>
> On Tue, Jul 28, 2009 at 11:25 AM, Bradford
> Stephens wrote:
>> Hey everyone,
>>
>> SLIGHT change of plans.
>>
>> A few people have asked me to move to a place with Air Conditioning,
>> since the temperature's in the 90's this week. So, here we go:
>>
>> Big Time Brewing Company
>> 4133 University Way NE
>> Seattle, WA 98105
>>
>> Call me at 904-415-3009 if you have any questions.
>>
>>
>> On Mon, Jul 27, 2009 at 12:16 PM, Bradford
>> Stephens wrote:
>>> Hello again!
>>>
>>> Yes, I know some of us are still recovering from OSCON. It's time for
>>> another delicious meetup to chat about Hadoop, HBase, Solr, Lucene,
>>> and more!
>>>
>>> UW is quite a pain for us to access until August, so we're changing
>>> the venue to one pretty close:
>>>
>>> Piccolo's Pizza
>>> 5301 Roosevelt Way NE
>>> (between 53rd St & 55th St)
>>>
>>> 6:45pm - 8:30 (or when we get bored)!
>>>
>>> As usual, people are more than welcome to give talks, whether they're
>>> long-format or lightning. I'd also really like to start thinking about
>>> hackathons, perhaps we could have one next month?
>>>
>>> I'll be talking about HBase .20 and the possibility of low-latency
>>> HBase Analytics. I'd be very excited to hear what people are up to!
>>>
>>> Contact me if there's any questions: 904-415-3009
>>>
>>> Cheers,
>>> Bradford
>>>
>>> --
>>> http://www.roadtofailure.com -- The Fringes of Scalability, Social
>>> Media, and Computer Science
>>>
>>
>>
>>
>> --
>> http://www.roadtofailure.com -- The Fringes of Scalability, Social
>> Media, and Computer Science
>>
>
>
>
> --
> http://www.roadtofailure.com -- The Fringes of Scalability, Social
> Media, and Computer Science
>





A Presentation on Building a Hadoop + Lucene System Architecture

2009-08-04 Thread Bradford Stephens
Hey all,

I just wanted to send a link to a presentation I made on how my
company is building its entire core BI infrastructure around Hadoop,
HBase, Lucene, and more. It features a decent amount of practical
advice: from rules for approaching scalability problems, to why we
chose certain aspects of the Hadoop Ecosystem. Perhaps you can use it
as justification for your own decisions, or as a jumping-off point to
utilizing Hadoop in the real world.

I hope you find it helpful! You can catch it at my blog:
http://www.roadtofailure.com . There are also a few inflammatory
articles, such as "Social Media Kills the RDBMS".

Ask me if you have any questions :)

-- 
http://www.hadoopconsulting.com -- Making Hadoop and your web apps
that use it scale
http://www.roadtofailure.com -- The Fringes of Scalability, Social
Media, and Computer Science


Re: THIS WEEK: PNW Hadoop, HBase / Apache Cloud Stack Users' Meeting, Wed Jul 29th, Seattle

2009-07-29 Thread Bradford Stephens
Don't forget this is tonight! Excited to see everyone there.

On Tue, Jul 28, 2009 at 11:25 AM, Bradford
Stephens wrote:
> Hey everyone,
>
> SLIGHT change of plans.
>
> A few people have asked me to move to a place with Air Conditioning,
> since the temperature's in the 90's this week. So, here we go:
>
> Big Time Brewing Company
> 4133 University Way NE
> Seattle, WA 98105
>
> Call me at 904-415-3009 if you have any questions.
>
>
> On Mon, Jul 27, 2009 at 12:16 PM, Bradford
> Stephens wrote:
>> Hello again!
>>
>> Yes, I know some of us are still recovering from OSCON. It's time for
>> another delicious meetup to chat about Hadoop, HBase, Solr, Lucene,
>> and more!
>>
>> UW is quite a pain for us to access until August, so we're changing
>> the venue to one pretty close:
>>
>> Piccolo's Pizza
>> 5301 Roosevelt Way NE
>> (between 53rd St & 55th St)
>>
>> 6:45pm - 8:30 (or when we get bored)!
>>
>> As usual, people are more than welcome to give talks, whether they're
>> long-format or lightning. I'd also really like to start thinking about
>> hackathons, perhaps we could have one next month?
>>
>> I'll be talking about HBase .20 and the possibility of low-latency
>> HBase Analytics. I'd be very excited to hear what people are up to!
>>
>> Contact me if there's any questions: 904-415-3009
>>
>> Cheers,
>> Bradford
>>
>> --
>> http://www.roadtofailure.com -- The Fringes of Scalability, Social
>> Media, and Computer Science
>>
>
>
>
> --
> http://www.roadtofailure.com -- The Fringes of Scalability, Social
> Media, and Computer Science
>





Re: THIS WEEK: PNW Hadoop, HBase / Apache Cloud Stack Users' Meeting, Wed Jul 29th, Seattle

2009-07-28 Thread Bradford Stephens
Hey everyone,

SLIGHT change of plans.

A few people have asked me to move to a place with Air Conditioning,
since the temperature's in the 90's this week. So, here we go:

Big Time Brewing Company
4133 University Way NE
Seattle, WA 98105

Call me at 904-415-3009 if you have any questions.


On Mon, Jul 27, 2009 at 12:16 PM, Bradford
Stephens wrote:
> Hello again!
>
> Yes, I know some of us are still recovering from OSCON. It's time for
> another delicious meetup to chat about Hadoop, HBase, Solr, Lucene,
> and more!
>
> UW is quite a pain for us to access until August, so we're changing
> the venue to one pretty close:
>
> Piccolo's Pizza
> 5301 Roosevelt Way NE
> (between 53rd St & 55th St)
>
> 6:45pm - 8:30 (or when we get bored)!
>
> As usual, people are more than welcome to give talks, whether they're
> long-format or lightning. I'd also really like to start thinking about
> hackathons, perhaps we could have one next month?
>
> I'll be talking about HBase .20 and the possibility of low-latency
> HBase Analytics. I'd be very excited to hear what people are up to!
>
> Contact me if there are any questions: 904-415-3009
>
> Cheers,
> Bradford
>
> --
> http://www.roadtofailure.com -- The Fringes of Scalability, Social
> Media, and Computer Science
>



-- 
http://www.roadtofailure.com -- The Fringes of Scalability, Social
Media, and Computer Science


THIS WEEK: PNW Hadoop, HBase / Apache Cloud Stack Users' Meeting, Wed Jul 29th, Seattle

2009-07-27 Thread Bradford Stephens
Hello again!

Yes, I know some of us are still recovering from OSCON. It's time for
another delicious meetup to chat about Hadoop, HBase, Solr, Lucene,
and more!

UW is quite a pain for us to access until August, so we're changing
the venue to one pretty close:

Piccolo's Pizza
5301 Roosevelt Way NE
(between 53rd St & 55th St)

6:45pm - 8:30 (or when we get bored)!

As usual, people are more than welcome to give talks, whether they're
long-format or lightning. I'd also really like to start thinking about
hackathons, perhaps we could have one next month?

I'll be talking about HBase .20 and the possibility of low-latency
HBase Analytics. I'd be very excited to hear what people are up to!

Contact me if there are any questions: 904-415-3009

Cheers,
Bradford

-- 
http://www.roadtofailure.com -- The Fringes of Scalability, Social
Media, and Computer Science


Re: Aggregating/Grouping Document Search Results on a Field

2009-07-13 Thread Bradford Stephens
Thanks for this -- we're also trying out bobo-browse for Lucene, and
early results look pretty enticing. They greatly sped up how fast you
read in documents from disk, among other things:
http://bobo-browse.wiki.sourceforge.net/

On Sat, Jul 11, 2009 at 12:10 AM, Shalin Shekhar
Mangar wrote:
> On Sat, Jul 11, 2009 at 12:01 AM, Bradford Stephens <
> bradfordsteph...@gmail.com> wrote:
>
>> Does the facet aggregation take place on the Solr search server, or
>> the Solr client?
>>
>> It's pretty slow for me -- on a machine with 8 cores/ 8 GB RAM, 50
>> million document index (about 36M unique values in the "author"
>> field), a query that returns 131,000 hits takes about 20 seconds to
>> calculate the top 50 authors. The query I'm running is this:
>>
>>
>> http://dttest10:8983/solr/select/select?q=java&facet=true&facet.field=authorname:
>>
>>
> Is the author field tokenized? Is it multi-valued? It is best to have
> untokenized fields.
>
> Solr 1.4 has huge improvements in faceting performance so you can try that
> and see if it helps. See Yonik's blog post about this -
> http://yonik.wordpress.com/2008/11/25/solr-faceted-search-performance-improvements/
>
> --
> Regards,
> Shalin Shekhar Mangar.
>



-- 
http://www.roadtofailure.com -- The Fringes of Scalability, Social
Media, and Computer Science


Re: Aggregating/Grouping Document Search Results on a Field

2009-07-10 Thread Bradford Stephens
Does the facet aggregation take place on the Solr search server, or
the Solr client?

It's pretty slow for me -- on a machine with 8 cores/ 8 GB RAM, 50
million document index (about 36M unique values in the "author"
field), a query that returns 131,000 hits takes about 20 seconds to
calculate the top 50 authors. The query I'm running is this:

http://dttest10:8983/solr/select/select?q=java&facet=true&facet.field=authorname:
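
A minimal sketch of how a request like the one above could be assembled
with explicit facet parameters. The helper name build_facet_url is
hypothetical, and rows, facet.limit and facet.mincount are standard Solr
faceting parameters that do not appear in the URL above; they are
assumptions added to ask only for the top 50 values and to skip
returning the documents themselves. Whether this changes the 20-second
timing is untested.

```python
from urllib.parse import urlencode


def build_facet_url(base_url, query, facet_field, limit=50):
    """Build a Solr select URL that asks for the top `limit` facet values.

    Hypothetical helper: only the query string is constructed here,
    no request is sent.
    """
    params = {
        "q": query,
        "rows": 0,                 # assumption: only facet counts are wanted
        "facet": "true",
        "facet.field": facet_field,
        "facet.limit": limit,      # assumption: cap the values returned
        "facet.mincount": 1,       # assumption: skip values with zero hits
    }
    return base_url + "?" + urlencode(params)


# Host, path and field name are copied from the URL above.
url = build_facet_url(
    "http://dttest10:8983/solr/select/select", "java", "authorname"
)
print(url)
```

The counts come back in the response under the facet section, so the
client only has to read the first 50 entries.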



On Thu, Jul 9, 2009 at 10:32 PM, Bradford
Stephens wrote:
> Oh, wow... I think that faceted search is the right path, especially
> since seeing this amazing site:
> http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Faceted-Search-Solr
>
> I hope it's performant over hundreds of thousands of search results :)
>
> On Thu, Jul 9, 2009 at 10:13 PM, Bradford
> Stephens wrote:
>> It looks like field collapsing may be the key:
>> http://issues.apache.org/jira/browse/SOLR-236
>>
>> But it also doesn't seem to be 'finalized' yet. I wonder how
>> performant it is with indexes of 50 million documents+?
>>
>> On Thu, Jul 9, 2009 at 9:42 PM, shb wrote:
>>> you can refer to the facet search of solr, that might help you.
>>>
>>> 2009/7/10 Bradford Stephens 
>>>
>>>> Greetings,
>>>>
>>>> We've been experimenting with grouping fields returned from document
>>>> search results in Lucene, and we haven't gotten anything very
>>>> encouraging. Basically, the more results we return, the longer it
>>>> takes -- tens of seconds. Probably because we're doing expensive disk
>>>> seeks. I'm hoping the SOLR crew out there may provide some insight :)
>>>>
>>>> What we're trying to do is similar to SQL's "GROUP BY".  Let's say we
>>>> have documents indexed by keyword for a content body, and also indexed
>>>> by an Author name. If I search our document store (very large) for the
>>>> word "laptop", I would like to be able to calculate the 10 authors
>>>> that appeared the most.
>>>>
>>>> I've done some searching through the mailing list, but couldn't glean
>>>> much insight. What do you think?
>>>>
>>>> --
>>>> http://www.roadtofailure.com -- The Fringes of Scalability, Social
>>>> Media, and Computer Science
>>>>
>>>



-- 
http://www.roadtofailure.com -- The Fringes of Scalability, Social
Media, and Computer Science


Re: Aggregating/Grouping Document Search Results on a Field

2009-07-09 Thread Bradford Stephens
Oh, wow... I think that faceted search is the right path, especially
since seeing this amazing site:
http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Faceted-Search-Solr

I hope it's performant over hundreds of thousands of search results :)

On Thu, Jul 9, 2009 at 10:13 PM, Bradford
Stephens wrote:
> It looks like field collapsing may be the key:
> http://issues.apache.org/jira/browse/SOLR-236
>
> But it also doesn't seem to be 'finalized' yet. I wonder how
> performant it is with indexes of 50 million documents+?
>
> On Thu, Jul 9, 2009 at 9:42 PM, shb wrote:
>> you can refer to the facet search of solr, that might help you.
>>
>> 2009/7/10 Bradford Stephens 
>>
>>> Greetings,
>>>
>>> We've been experimenting with grouping fields returned from document
>>> search results in Lucene, and we haven't gotten anything very
>>> encouraging. Basically, the more results we return, the longer it
>>> takes -- tens of seconds. Probably because we're doing expensive disk
>>> seeks. I'm hoping the SOLR crew out there may provide some insight :)
>>>
>>> What we're trying to do is similar to SQL's "GROUP BY".  Let's say we
>>> have documents indexed by keyword for a content body, and also indexed
>>> by an Author name. If I search our document store (very large) for the
>>> word "laptop", I would like to be able to calculate the 10 authors
>>> that appeared the most.
>>>
>>> I've done some searching through the mailing list, but couldn't glean
>>> much insight. What do you think?
>>>
>>> --
>>> http://www.roadtofailure.com -- The Fringes of Scalability, Social
>>> Media, and Computer Science
>>>
>>



-- 
http://www.roadtofailure.com -- The Fringes of Scalability, Social
Media, and Computer Science


Re: Aggregating/Grouping Document Search Results on a Field

2009-07-09 Thread Bradford Stephens
It looks like field collapsing may be the key:
http://issues.apache.org/jira/browse/SOLR-236

But it also doesn't seem to be 'finalized' yet. I wonder how
performant it is with indexes of 50 million documents+?

On Thu, Jul 9, 2009 at 9:42 PM, shb wrote:
> you can refer to the facet search of solr, that might help you.
>
> 2009/7/10 Bradford Stephens 
>
>> Greetings,
>>
>> We've been experimenting with grouping fields returned from document
>> search results in Lucene, and we haven't gotten anything very
>> encouraging. Basically, the more results we return, the longer it
>> takes -- tens of seconds. Probably because we're doing expensive disk
>> seeks. I'm hoping the SOLR crew out there may provide some insight :)
>>
>> What we're trying to do is similar to SQL's "GROUP BY".  Let's say we
>> have documents indexed by keyword for a content body, and also indexed
>> by an Author name. If I search our document store (very large) for the
>> word "laptop", I would like to be able to calculate the 10 authors
>> that appeared the most.
>>
>> I've done some searching through the mailing list, but couldn't glean
>> much insight. What do you think?
>>
>> --
>> http://www.roadtofailure.com -- The Fringes of Scalability, Social
>> Media, and Computer Science
>>
>



-- 
http://www.roadtofailure.com -- The Fringes of Scalability, Social
Media, and Computer Science


Aggregating/Grouping Document Search Results on a Field

2009-07-09 Thread Bradford Stephens
Greetings,

We've been experimenting with grouping fields returned from document
search results in Lucene, and we haven't gotten anything very
encouraging. Basically, the more results we return, the longer it
takes -- tens of seconds. Probably because we're doing expensive disk
seeks. I'm hoping the SOLR crew out there may provide some insight :)

What we're trying to do is similar to SQL's "GROUP BY".  Let's say we
have documents indexed by keyword for a content body, and also indexed
by an Author name. If I search our document store (very large) for the
word "laptop", I would like to be able to calculate the 10 authors
that appeared the most.
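
To make the goal concrete, here is a minimal sketch of the grouping in
plain Python. It assumes the matching hits are already in memory as
dictionaries with an "author" key, which is a hypothetical shape for
illustration and not a Lucene or Solr API. It shows only the semantics
of the "GROUP BY author, ORDER BY count" step, not how to do it
efficiently against a large index.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_authors(hits: Iterable[dict], n: int = 10) -> List[Tuple[str, int]]:
    """Count hits per author and return the n most frequent authors."""
    # Hits without an author value are skipped rather than counted.
    counts = Counter(hit["author"] for hit in hits if hit.get("author"))
    return counts.most_common(n)


# Made-up sample hits for the query "laptop".
hits = [
    {"author": "alice", "body": "laptop review"},
    {"author": "bob", "body": "laptop battery"},
    {"author": "alice", "body": "laptop screen"},
]
print(top_authors(hits, n=2))  # [('alice', 2), ('bob', 1)]
```

Done this way, every hit has to be loaded just to read one field, which
is the cost we are trying to avoid.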

I've done some searching through the mailing list, but couldn't glean
much insight. What do you think?

-- 
http://www.roadtofailure.com -- The Fringes of Scalability, Social
Media, and Computer Science


Re: THIS WEEK: PNW Hadoop / Apache Cloud Stack Users' Meeting, Wed Jun 24th, Seattle

2009-06-25 Thread Bradford Stephens
Hey all,

Just writing a quick note of "thanks": we had another solid group of
people show up! As always, we learned quite a lot about interesting
use cases for Hadoop, Lucene, and the rest of the Apache 'Cloud
Stack'.

I couldn't get it taped, but we talked about:

-Scaling Lucene with Katta and the Katta infrastructure
-the need for low-latency BI on distributed document stores
-Lots and lots of detail on Amazon Elastic MapReduce

We'll be doing it again next month --  July 29th.

On Mon, Jun 22, 2009 at 5:40 PM, Bradford
Stephens wrote:
> Hey all, just a friendly reminder that this is Wednesday! I hope to see
> everyone there again. Please let me know if there's something interesting
> you'd like to talk about -- I'll help however I can. You don't even need a
> PowerPoint presentation -- there are many whiteboards. I'll try to have a
> video cam, but no promises.
> Feel free to call at 904-415-3009 if you need directions or have any questions :)
> ~~`
> Greetings,
>
> On the heels of our smashing success last month, we're going to be
> convening the Pacific Northwest (Oregon and Washington)
> Hadoop/HBase/Lucene/etc. meetup on the last Wednesday of June, the
> 24th.  The meeting should start at 6:45, organized chats will end
> around 8:00, and then there shall be discussion and socializing :)
>
> The meeting will be at the University of Washington in
> Seattle again. It's in the Computer Science building (not electrical
> engineering!), room 303, located
> here: http://www.washington.edu/home/maps/southcentral.html?80,70,792,660
>
> If you've ever wanted to learn more about distributed computing, or
> just see how other people are innovating with Hadoop, you can't miss
> this opportunity. Our focus is on learning and education, so every
> presentation must end with a few questions for the group to research
> and discuss. (But if you're an introvert, we won't mind).
>
> The format is two or three 15-minute "deep dive" talks, followed by
> several 5 minute "lightning chats". We had a few interesting topics
> last month:
>
> -Building a Social Media Analysis company on the Apache Cloud Stack
> -Cancer detection in images using Hadoop
> -Real-time OLAP on HBase -- is it possible?
> -Video and Network Flow Analysis in Hadoop vs. Distributed RDBMS
> -Custom Ranking in Lucene
>
> We already have one "deep dive" scheduled this month, on truly
> scalable Lucene with Katta. If you've been looking for a way to handle
> those large Lucene indices, this is a must-attend!
>
> Looking forward to seeing everyone there again.
>
> Cheers,
> Bradford
>
> http://www.roadtofailure.com -- The Fringes of Distributed Computing,
> Computer Science, and Social Media.


Re: THIS WEEK: PNW Hadoop / Apache Cloud Stack Users' Meeting, Wed Jun 24th, Seattle

2009-06-23 Thread Bradford Stephens
Greetings,

I've gotten a few replies on this, but I'd really like to know who
else is coming. Just send me a quick note :)

Cheers,
Bradford

On Mon, Jun 22, 2009 at 5:40 PM, Bradford
Stephens wrote:
> Hey all, just a friendly reminder that this is Wednesday! I hope to see
> everyone there again. Please let me know if there's something interesting
> you'd like to talk about -- I'll help however I can. You don't even need a
> PowerPoint presentation -- there are many whiteboards. I'll try to have a
> video cam, but no promises.
> Feel free to call at 904-415-3009 if you need directions or have any questions :)
> ~~`
> Greetings,
>
> On the heels of our smashing success last month, we're going to be
> convening the Pacific Northwest (Oregon and Washington)
> Hadoop/HBase/Lucene/etc. meetup on the last Wednesday of June, the
> 24th.  The meeting should start at 6:45, organized chats will end
> around 8:00, and then there shall be discussion and socializing :)
>
> The meeting will be at the University of Washington in
> Seattle again. It's in the Computer Science building (not electrical
> engineering!), room 303, located
> here: http://www.washington.edu/home/maps/southcentral.html?80,70,792,660
>
> If you've ever wanted to learn more about distributed computing, or
> just see how other people are innovating with Hadoop, you can't miss
> this opportunity. Our focus is on learning and education, so every
> presentation must end with a few questions for the group to research
> and discuss. (But if you're an introvert, we won't mind).
>
> The format is two or three 15-minute "deep dive" talks, followed by
> several 5 minute "lightning chats". We had a few interesting topics
> last month:
>
> -Building a Social Media Analysis company on the Apache Cloud Stack
> -Cancer detection in images using Hadoop
> -Real-time OLAP on HBase -- is it possible?
> -Video and Network Flow Analysis in Hadoop vs. Distributed RDBMS
> -Custom Ranking in Lucene
>
> We already have one "deep dive" scheduled this month, on truly
> scalable Lucene with Katta. If you've been looking for a way to handle
> those large Lucene indices, this is a must-attend!
>
> Looking forward to seeing everyone there again.
>
> Cheers,
> Bradford
>
> http://www.roadtofailure.com -- The Fringes of Distributed Computing,
> Computer Science, and Social Media.


THIS WEEK: PNW Hadoop / Apache Cloud Stack Users' Meeting, Wed Jun 24th, Seattle

2009-06-22 Thread Bradford Stephens
Hey all, just a friendly reminder that this is Wednesday! I hope to see
everyone there again. Please let me know if there's something interesting
you'd like to talk about -- I'll help however I can. You don't even need a
PowerPoint presentation -- there are many whiteboards. I'll try to have a
video cam, but no promises.
Feel free to call at 904-415-3009 if you need directions or have any questions :)

~~`

Greetings,

On the heels of our smashing success last month, we're going to be
convening the Pacific Northwest (Oregon and Washington)
Hadoop/HBase/Lucene/etc. meetup on the last Wednesday of June, the
24th.  The meeting should start at 6:45, organized chats will end
around 8:00, and then there shall be discussion and socializing :)

The meeting will be at the University of Washington in
Seattle again. It's in the Computer Science building (not electrical
engineering!), room 303, located here:
http://www.washington.edu/home/maps/southcentral.html?80,70,792,660

If you've ever wanted to learn more about distributed computing, or
just see how other people are innovating with Hadoop, you can't miss
this opportunity. Our focus is on learning and education, so every
presentation must end with a few questions for the group to research
and discuss. (But if you're an introvert, we won't mind).

The format is two or three 15-minute "deep dive" talks, followed by
several 5 minute "lightning chats". We had a few interesting topics
last month:

-Building a Social Media Analysis company on the Apache Cloud Stack
-Cancer detection in images using Hadoop
-Real-time OLAP on HBase -- is it possible?
-Video and Network Flow Analysis in Hadoop vs. Distributed RDBMS
-Custom Ranking in Lucene

We already have one "deep dive" scheduled this month, on truly
scalable Lucene with Katta. If you've been looking for a way to handle
those large Lucene indices, this is a must-attend!

Looking forward to seeing everyone there again.

Cheers,
Bradford

http://www.roadtofailure.com -- The Fringes of Distributed Computing,
Computer Science, and Social Media.


PNW Hadoop / Apache Cloud Stack Users' Meeting, Wed Jun 24th, Seattle

2009-06-15 Thread Bradford Stephens
Greetings,

On the heels of our smashing success last month, we're going to be
convening the Pacific Northwest (Oregon and Washington)
Hadoop/HBase/Lucene/etc. meetup on the last Wednesday of June, the
24th.  The meeting should start at 6:45, organized chats will end
around 8:00, and then there shall be discussion and socializing :)

The meeting will probably be at the University of Washington in
Seattle again -- a (better) map and directions shall be provided when
the location is confirmed.

If you've ever wanted to learn more about distributed computing, or
just see how other people are innovating with Hadoop, you can't miss
this opportunity. Our focus is on learning and education, so every
presentation must end with a few questions for the group to research
and discuss. (But if you're an introvert, we won't mind).

The format is two or three 15-minute "deep dive" talks, followed by
several 5 minute "lightning chats". We had a few interesting topics
last month:

-Building a Social Media Analysis company on the Apache Cloud Stack
-Cancer detection in images using Hadoop
-Real-time OLAP on HBase -- is it possible?
-Video and Network Flow Analysis in Hadoop vs. Distributed RDBMS
-Custom Ranking in Lucene

We already have one "deep dive" scheduled this month, on truly
scalable Lucene with Katta. If you've been looking for a way to handle
those large Lucene indices, this is a must-attend!

Looking forward to seeing everyone there again.

Cheers,
Bradford

http://www.roadtofailure.com -- The Fringes of Distributed Computing,
Computer Science, and Social Media.


Re: Seattle / PNW Hadoop + Lucene User Group?

2009-06-03 Thread Bradford Stephens
Sorry, no videos this time. The conversation wasn't very structured... next
month I'll record it :)

On Wed, Jun 3, 2009 at 1:59 PM, Bhupesh Bansal  wrote:

> Great Bradford,
>
> Can you post some videos if you have some ?
>
> Best
> Bhupesh
>
>
>
> On 6/3/09 11:58 AM, "Bradford Stephens" 
> wrote:
>
> > Hey everyone!
> > I just wanted to give a BIG THANKS to everyone who came. We had over a
> > dozen people, and a few got lost at UW :)  [I would have sent this update
> > earlier, but I flew to Florida the day after the meeting].
> >
> > If you didn't come, you missed quite a bit of learning and topics, such as:
> >
> > -Building a Social Media Analysis company on the Apache Cloud Stack
> > -Cancer detection in images using Hadoop
> > -Real-time OLAP
> > -Scalable Lucene using Katta and Hadoop
> > -Video and Network Flow
> > -Custom Ranking in Lucene
> >
> > I'm going to update our wiki with the topics, and a few questions raised and
> > the lessons we've learned.
> >
> > The next meetup will be June 24th. Be there, or be... boring :)
> >
> > Cheers,
> > Bradford
> >
> > On Thu, Apr 16, 2009 at 3:27 PM, Bradford Stephens <
> > bradfordsteph...@gmail.com> wrote:
> >
> >> Greetings,
> >>
> >> Would anybody be willing to join a PNW Hadoop and/or Lucene User Group
> >> with me in the Seattle area? I can donate some facilities, etc. -- I
> >> also always have topics to speak about :)
> >>
> >> Cheers,
> >> Bradford
> >>
>
>


Re: Seattle / PNW Hadoop + Lucene User Group?

2009-06-03 Thread Bradford Stephens
Hey everyone!
I just wanted to give a BIG THANKS to everyone who came. We had over a
dozen people, and a few got lost at UW :)  [I would have sent this update
earlier, but I flew to Florida the day after the meeting].

If you didn't come, you missed quite a bit of learning and topics, such as:

-Building a Social Media Analysis company on the Apache Cloud Stack
-Cancer detection in images using Hadoop
-Real-time OLAP
-Scalable Lucene using Katta and Hadoop
-Video and Network Flow
-Custom Ranking in Lucene

I'm going to update our wiki with the topics, and a few questions raised and
the lessons we've learned.

The next meetup will be June 24th. Be there, or be... boring :)

Cheers,
Bradford

On Thu, Apr 16, 2009 at 3:27 PM, Bradford Stephens <
bradfordsteph...@gmail.com> wrote:

> Greetings,
>
> Would anybody be willing to join a PNW Hadoop and/or Lucene User Group
> with me in the Seattle area? I can donate some facilities, etc. -- I
> also always have topics to speak about :)
>
> Cheers,
> Bradford
>


PNW Hadoop + Apache Cloud Stack Meetup, Wed. May 27th:

2009-05-26 Thread Bradford Stephens
Greetings,
This is a friendly reminder that the 1st meetup for the PNW Hadoop + Apache
Cloud Stack User Group is THIS WEDNESDAY at 6:45pm. We're very excited to
have everyone attend!

University of Washington, Allen Center Room 303, at 6:45pm on Wednesday, May
27, 2009.
I'm going to put together a map, and a wiki so we can collab.

The Allen Center is located here:
http://www.washington.edu/home/maps/?CSE

What I'm envisioning is a meetup for about 2 hours: we'll have two in-depth
talks of 15-20 minutes each, and then several "lightning talks" of 5
minutes. We'll then have discussion and 'social time'.
Let me know if you're interested in speaking or attending.

I'd like to focus on education, so every presentation *needs* to ask some
questions at the end. We can talk about these after the presentations, and
I'll record what we've learned in a wiki and share that with the rest of
us.

Looking forward to meeting you all!


Cheers,
Bradford Stephens


Re: Seattle / PNW Hadoop + Lucene User Group?

2009-05-19 Thread Bradford Stephens
Hello everyone! We (finally) have space secured (it's a tough task!):
University of Washington, Allen Center Room 303, at 6:45pm on Wednesday, May
27, 2009.
I'm going to put together a map, and a wiki so we can collab.

What I'm envisioning is a meetup for about 2 hours: we'll have two in-depth
talks of 15-20 minutes each, and then several "lightning talks" of 5
minutes. We'll then have discussion and 'social time'.
Let me know if you're interested in speaking or attending.

I'd like to focus on education, so every presentation *needs* to ask some
questions at the end. We can talk about these after the presentations, and
I'll record what we've learned in a wiki and share that with the rest of
us.

Looking forward to meeting you all!

Cheers,
Bradford

On Thu, Apr 16, 2009 at 3:27 PM, Bradford Stephens <
bradfordsteph...@gmail.com> wrote:

> Greetings,
>
> Would anybody be willing to join a PNW Hadoop and/or Lucene User Group
> with me in the Seattle area? I can donate some facilities, etc. -- I
> also always have topics to speak about :)
>
> Cheers,
> Bradford
>


Re: Seattle / PNW Hadoop + Lucene User Group?

2009-04-20 Thread Bradford Stephens
Thanks for the responses, everyone. Where shall we host? My company
can offer space in our building in Factoria, but it's not exactly a
'cool' or 'fun' place. I can also reserve a room at a local library. I
can bring some beer and light refreshments.

On Mon, Apr 20, 2009 at 7:22 AM, Matthew Hall  wrote:
> Same here, sadly there isn't much call for Lucene user groups in Maine.  It
> would be nice though ^^
>
> Matt
>
> Amin Mohammed-Coleman wrote:
>>
>> I would love to come but I'm afraid I'm stuck in rainy old England :(
>>
>> Amin
>>
>> On 18 Apr 2009, at 01:08, Bradford Stephens 
>> wrote:
>>
>>> OK, we've got 3 people... that's enough for a party? :)
>>>
>>> Surely there must be dozens more of you guys out there... c'mon,
>>> accelerate your knowledge! Join us in Seattle!
>>>
>>>
>>>
>>> On Thu, Apr 16, 2009 at 3:27 PM, Bradford Stephens
>>>  wrote:
>>>>
>>>> Greetings,
>>>>
>>>> Would anybody be willing to join a PNW Hadoop and/or Lucene User Group
>>>> with me in the Seattle area? I can donate some facilities, etc. -- I
>>>> also always have topics to speak about :)
>>>>
>>>> Cheers,
>>>> Bradford
>>>>
>>>
>>> -
>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>>
>>
>


Re: Seattle / PNW Hadoop + Lucene User Group?

2009-04-17 Thread Bradford Stephens
OK, we've got 3 people... that's enough for a party? :)

Surely there must be dozens more of you guys out there... c'mon,
accelerate your knowledge! Join us in Seattle!



On Thu, Apr 16, 2009 at 3:27 PM, Bradford Stephens
 wrote:
> Greetings,
>
> Would anybody be willing to join a PNW Hadoop and/or Lucene User Group
> with me in the Seattle area? I can donate some facilities, etc. -- I
> also always have topics to speak about :)
>
> Cheers,
> Bradford
>


Seattle / PNW Hadoop + Lucene User Group?

2009-04-16 Thread Bradford Stephens
Greetings,

Would anybody be willing to join a PNW Hadoop and/or Lucene User Group
with me in the Seattle area? I can donate some facilities, etc. -- I
also always have topics to speak about :)

Cheers,
Bradford