Re: [jug-discussion] Searching large object graphs

2005-01-03 Thread Bryan . ONeal
Could be fun. I have some free time over the next week and need to get back
into the programming game.  More importantly, I think I could learn a bunch
from other people's code :)

On Sun, 2 Jan 2005, Erik Hatcher wrote:

> 
> On Jan 2, 2005, at 9:18 PM, Tim Colson wrote:
> 
> >> I'm in if Tim wants
> >> to write a few unit tests that candidate
> >> implementations should turn green.
> > With the holidays and new puppy, I haven't been responding to email as
> > quickly...
> >
> > So Erik/Bryan -- are you gents saying if I code up some dummy objects, and
> > then some junit tests with pseudo queries like: "all objects with a name or
> > skill containing java", then you gents would code up some in-memory
> > searches?
> 
> That's what I'm saying!  :)
> 
> I can't promise quick turn-around time; it'd depend on how tricky 
> you made your tests.  But the "all objects with a name or skill 
> containing java" one shouldn't take long (on the order of minutes to 
> code given a setUp and testXXX method that already got me the data and 
> expectations).
> 
> Maybe something like this:
> 
>   private ObjectManager om;
> 
>   public void setUp() {
>     // create a Collection of objects
>     // ObjectManager will be the class I implement -
>     // you could mock it to get the test to compile
>     om = new ObjectManager();
>     om.add(collection);
>   }
> 
>   public void testFindJava() {
>     // how do we phrase the query generically?
>     // with a Lucene implementation you could do "name:java OR skill:java"
>     Collection results = om.findNameOrSkillContaining("java");
> 
>     // assert whatever you like about the returned objects -
>     // should they be in any particular order?
>   }
> 
>   Erik
> 
> >
> > I could be game for that.
> >
> > Tim
> >
> > -
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]



Re: [jug-discussion] Searching large object graphs

2005-01-02 Thread Erik Hatcher
On Jan 2, 2005, at 9:18 PM, Tim Colson wrote:

>> I'm in if Tim wants
>> to write a few unit tests that candidate
>> implementations should turn green.
> With the holidays and new puppy, I haven't been responding to email as
> quickly...
>
> So Erik/Bryan -- are you gents saying if I code up some dummy objects, and
> then some junit tests with pseudo queries like: "all objects with a name or
> skill containing java", then you gents would code up some in-memory
> searches?

That's what I'm saying!  :)

I can't promise quick turn-around time; it'd depend on how tricky 
you made your tests.  But the "all objects with a name or skill 
containing java" one shouldn't take long (on the order of minutes to 
code given a setUp and testXXX method that already got me the data and 
expectations).

Maybe something like this:
	private ObjectManager om;

	public void setUp() {
	  // create a Collection of objects
	  // ObjectManager will be the class I implement -
	  // you could mock it to get the test to compile
	  om = new ObjectManager();
	  om.add(collection);
	}

	public void testFindJava() {
	  // how do we phrase the query generically?
	  // with a Lucene implementation you could do "name:java OR skill:java"
	  Collection results = om.findNameOrSkillContaining("java");

	  // assert whatever you like about the returned objects -
	  // should they be in any particular order?
	}
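
For reference, here is a minimal sketch of the simplest thing that could turn that test green without Lucene: a linear scan. The Person class, its fields, and the typed signatures are assumptions made for illustration; the thread never defines the objects being searched, and ObjectManager is only named, not specified.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Locale;

// Hypothetical domain object; the thread never defines one.
class Person {
    final String name;
    final List<String> skills;

    Person(String name, String... skills) {
        this.name = name;
        this.skills = Arrays.asList(skills);
    }
}

// Naive stand-in for the ObjectManager named in the test sketch:
// a linear scan, not the Lucene-backed version discussed in the thread.
class ObjectManager {
    private final List<Person> people = new ArrayList<>();

    void add(Collection<Person> collection) {
        people.addAll(collection);
    }

    // Case-insensitive substring match against the name or any skill.
    Collection<Person> findNameOrSkillContaining(String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        List<Person> results = new ArrayList<>();
        for (Person p : people) {
            if (contains(p.name, needle) || anyContains(p.skills, needle)) {
                results.add(p);
            }
        }
        return results;
    }

    private static boolean contains(String value, String needle) {
        return value.toLowerCase(Locale.ROOT).contains(needle);
    }

    private static boolean anyContains(List<String> values, String needle) {
        for (String v : values) {
            if (contains(v, needle)) {
                return true;
            }
        }
        return false;
    }
}
```

Results come back in insertion order here, which is one possible answer to the ordering question in the test; a ranked implementation would presumably order by relevance instead.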

Erik
> I could be game for that.
>
> Tim


RE: [jug-discussion] Searching large object graphs

2005-01-02 Thread Tim Colson
> I'm in if Tim wants 
> to write a few unit tests that candidate 
> implementations should turn green.
With the holidays and new puppy, I haven't been responding to email as
quickly...

So Erik/Bryan -- are you gents saying if I code up some dummy objects, and
then some junit tests with pseudo queries like: "all objects with a name or
skill containing java", then you gents would code up some in-memory
searches?

I could be game for that.

Tim




Re: [jug-discussion] Searching large object graphs

2004-12-30 Thread Bryan . ONeal
Ahh, looking back on it, I did read it as a more general problem (there I go
again, reading way too much into things).

Yes, I agree your suggestion is a very good one.


On Thu, 30 Dec 2004, Erik Hatcher wrote:

> 
> On Dec 30, 2004, at 11:58 AM, [EMAIL PROTECTED] wrote:
> 
> > O...  Ok, that seems like fun (I know I am sick, but truth is I have time
> > to kill at home for the next week and a half).  But we should also have
> > different kinds of common data, like a few hundred complete personal
> > records, a few books/blogs, etc.  We could also see the difference between
> > a memory-resident ODB structure and an RDB structure.  For implementation
> > time we should also try one technology we are familiar with and one we are
> > not, as implementation time is inversely proportional to prior knowledge
> > of the method used to implement.  Perhaps I can get more practice at Lucene.
> 
> You're getting pretty carried away here!  I am after simplicity - 
> meeting what Tim's original question was about, nothing more.  From 
> what you just said, and what you say later, it sounds like you're 
> expanding the requirements dramatically.  I'm in if Tim wants to write 
> a few unit tests that candidate implementations should turn green.
> 
> > Also, am I the only one who has to deal with the Track Everything Objects?
> > I ask because a few hundred tuples in a record is not uncommon.  It is also
> > not uncommon to have them related to a few dozen other entities, each of
> > which may have 25-50 tuples.  And the users come up with wacky searches
> > like "I want to know every person who has ever been on a south Phoenix
> > construction project with Tim after he became a lead."  I know there are
> > some scary smart people on this list (I am not necessarily one of them)
> > and I would love to see some good code.
> 
> This vastly changes the landscape.  This sounds like the job for an RDF 
> engine (Kowari is the one I hear the most about).
> 
> I'm not interested in building a mega catch-all kinda in-memory object 
> store.  Tim had one concrete example, and I said Lucene looked perfect 
> for it.  Lucene is awesome, but it's not the end solution for every 
> conceivable scenario.  If Tim's use cases are along the lines of the 
> example he provided then I'm up for making whatever unit tests he comes 
> up with pass with a Lucene implementation under the covers.
> 
>   Erik



Re: [jug-discussion] Searching large object graphs

2004-12-30 Thread Bryan . ONeal
I must also agree.  I will create another example: let us say you need to get
from point A to point B a mile away.  Is it better to walk or drive?  Well,
drive, of course.  But if you do not have a car, or do not know how to drive,
and cannot wait for a cab, then you're stuck walking.  It takes ME less time
to build a good-enough solution than it does to learn how to build a best
solution, particularly if the issue is complex and must be understood by many
people who will need to modify and maintain the code later.  That, and I have
this really obnoxious drive to keep my code as pure J2EE as possible.  But by
the same token I will use some black-box stuff to save time (Resin), but I
hate to do it.

From a business side, if I give a presentation that says I can give you a
search engine that will return a very good result in 45 seconds for $500.00, or
I can give you something that will return all results in two seconds and
display them in a categorized fashion for $5000.00, they will usually choose
the first solution.

So I am not saying other technologies are bad; I am just saying I prefer to
hand code searches for the large OODB I deal with.  And there is no way I
could beat Lucene; I could just implement something by hand faster than I
could learn and implement Lucene.  Now if I had to do a fresh implementation
more often, or had need for its power, then I would make the investment.
However, for the occasional search through a few large, memory-resident
objects I could do it with an acceptable speed variance.  Also, Richard said
he did not have time to read my email, so I find it odd that he would have
time to learn a whole new tech.  Which goes back to my point about hand
coding J2EE being easier.

But then again I like hand coding J2EE, and I wish I could learn more about
really hard-core system stuff, like what else I can do with the robot, or how
I can get system idle times, or capture a desktop, etc.  But so far all I have
found are J-C++ black-box hacks that are seriously system dependent, and I do
not care to mess with that (again, not saying Lucene or any other
aforementioned tech is that way, just showing where I am coming from and that
I like hand coding).

And since I somehow feel this is becoming a flame (could just be the cold,
foggy morning that makes me feel that way), I will back off, as my suggestion
does not seem to have been taken as some words from the devil.  So I apologize
if I misunderstood the original question and passed along a solution that you
all think is really off the wall.  Which is why I offered a really short "Hey,
I think I have this wrong, so here is the short version of what I feel is the
answer".

Well, back to work :)
Accounting, not programming :)


I shall keep my novice opinions to myself for a while :)

On Thu, 30 Dec 2004, Drew Davidson wrote:

> Richard Hightower wrote:
> 
> >I agree. But what best are you talking about. The best technical solution or
> >the best business solution. The best business solution is not always the
> >best technical solution.
> >
> >(Mounting high horse...) Engineering is about tradeoffs: budget, time,
> >beer... Actually I just threw beer in there for fun.
> >
> >I will continue to focus on good enough technical solution to fit the
> >customers need.
> >
> >Actually, I will continue to play with technology that I am interested in
> >and telling the customer it is the best business solution (just kidding). I
> >will bile all technology I don't understand (if I don't understand it... How
> >can it be good?) Sorry I was channeling the bile blog :)
> >  
> >
> I interpret "best" to mean the most comprehensive, extensible solution 
> possible.  "Good" is therefore something that works reasonably well for 
> the purpose to which you are putting it.  Simple as possible but no simpler.
> 
> - Drew
> 
> -- 
> Drew Davidson | OGNL Technology
>   Email: [EMAIL PROTECTED]
>     Web: http://www.ognl.org
>     Vox: (520) 531-1966
>     Fax: (520) 531-1965
>  Mobile: (520) 405-2967



Re: [jug-discussion] Searching large object graphs

2004-12-30 Thread Erik Hatcher
On Dec 30, 2004, at 11:58 AM, [EMAIL PROTECTED] wrote:
> O...  Ok, that seems like fun (I know I am sick, but truth is I have time
> to kill at home for the next week and a half).  But we should also have
> different kinds of common data, like a few hundred complete personal
> records, a few books/blogs, etc.  We could also see the difference between
> a memory-resident ODB structure and an RDB structure.  For implementation
> time we should also try one technology we are familiar with and one we are
> not, as implementation time is inversely proportional to prior knowledge
> of the method used to implement.  Perhaps I can get more practice at Lucene.

You're getting pretty carried away here!  I am after simplicity - 
meeting what Tim's original question was about, nothing more.  From 
what you just said, and what you say later, it sounds like you're 
expanding the requirements dramatically.  I'm in if Tim wants to write 
a few unit tests that candidate implementations should turn green.

> Also, am I the only one who has to deal with the Track Everything Objects?
> I ask because a few hundred tuples in a record is not uncommon.  It is also
> not uncommon to have them related to a few dozen other entities, each of
> which may have 25-50 tuples.  And the users come up with wacky searches
> like "I want to know every person who has ever been on a south Phoenix
> construction project with Tim after he became a lead."  I know there are
> some scary smart people on this list (I am not necessarily one of them)
> and I would love to see some good code.

This vastly changes the landscape.  This sounds like the job for an RDF 
engine (Kowari is the one I hear the most about).

I'm not interested in building a mega catch-all kinda in-memory object 
store.  Tim had one concrete example, and I said Lucene looked perfect 
for it.  Lucene is awesome, but it's not the end solution for every 
conceivable scenario.  If Tim's use cases are along the lines of the 
example he provided then I'm up for making whatever unit tests he comes 
up with pass with a Lucene implementation under the covers.

Erik


Re: [jug-discussion] Searching large object graphs

2004-12-30 Thread Bryan . ONeal
O...  Ok, that seems like fun (I know I am sick, but truth is I have time
to kill at home for the next week and a half).  But we should also have
different kinds of common data, like a few hundred complete personal records,
a few books/blogs, etc.  We could also see the difference between a
memory-resident ODB structure and an RDB structure.  For implementation time
we should also try one technology we are familiar with and one we are not, as
implementation time is inversely proportional to prior knowledge of the method
used to implement.  Perhaps I can get more practice at Lucene.

If we get a half dozen of us and share our code, it could be quite the learning
experience as we find out what methods are comparable, what search methods
are best on what kind of data, whether there is a good multiple-use tech that
performs reasonably well on many kinds, etc.   :)

I'm down :)


Also, am I the only one who has to deal with the Track Everything Objects?  I
ask because a few hundred tuples in a record is not uncommon.  It is also not
uncommon to have them related to a few dozen other entities, each of which may
have 25-50 tuples.  And the users come up with wacky searches like "I want to
know every person who has ever been on a south Phoenix construction project
with Tim after he became a lead."  I know there are some scary smart people
on this list (I am not necessarily one of them) and I would love to see some
good code.
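
To make that kind of query concrete, here is a rough hand-coded sketch over an in-memory object graph. Everything in it is an assumption for illustration: the Project and Assignment types, their fields, and the reading of "after he became a lead" as "projects Tim joined on or after the start of his first lead assignment".

```java
import java.time.LocalDate;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical types; field names are invented for illustration.
class Project {
    final String name;
    final String region;
    final String type;

    Project(String name, String region, String type) {
        this.name = name;
        this.region = region;
        this.type = type;
    }
}

class Assignment {
    final String person;
    final Project project;
    final String role;
    final LocalDate start;

    Assignment(String person, Project project, String role, LocalDate start) {
        this.person = person;
        this.project = project;
        this.role = role;
        this.start = start;
    }
}

class GraphQuery {
    // People who share a matching project with `who`, counting only projects
    // that `who` joined on or after first holding the role "lead".
    static Set<String> coworkersAfterLead(List<Assignment> all, String who,
                                          String region, String type) {
        // Step 1: find the earliest date on which `who` held the lead role.
        LocalDate leadSince = null;
        for (Assignment a : all) {
            if (a.person.equals(who) && a.role.equals("lead")
                    && (leadSince == null || a.start.isBefore(leadSince))) {
                leadSince = a.start;
            }
        }
        Set<String> result = new LinkedHashSet<>();
        if (leadSince == null) {
            return result;  // never a lead, so nothing qualifies
        }

        // Step 2: collect the matching projects `who` joined from that date on.
        // Project does not override equals, so this set compares by identity.
        Set<Project> shared = new LinkedHashSet<>();
        for (Assignment a : all) {
            if (a.person.equals(who) && !a.start.isBefore(leadSince)
                    && a.project.region.equals(region)
                    && a.project.type.equals(type)) {
                shared.add(a.project);
            }
        }

        // Step 3: everyone else assigned to one of those projects.
        for (Assignment a : all) {
            if (!a.person.equals(who) && shared.contains(a.project)) {
                result.add(a.person);
            }
        }
        return result;
    }
}
```

Three passes over the assignments is fine for a few thousand records; whether it stays acceptable on the larger record sets described above is exactly what a timed test would have to show.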


On Wed, 29 Dec 2004, Erik Hatcher wrote:

> 
> On Dec 29, 2004, at 5:06 PM, [EMAIL PROTECTED] wrote:
> > 3) Lucene is a very good system IF you have the kind of loose data it is
> > coded for.  However, if you have tight objects, the overhead it spends in
> > organizing its search is wasted.  So, if your 100K object is, say, a book
> > with a half dozen attributes all containing similar data types, then
> > you're fine.  If, however, your 100K object is a development project with
> > 250 attributes of mixed data types, then it is not so good.
> 
> Structured vs. unstructured searching is a very interesting topic.  
> XQuery is well worth consideration here.
> 
> I've found in the work I do that folks talk about true structured 
> search, but when it comes to designing a search interface it becomes 
> vastly more complex for users to comprehend how to formulate XPath or 
> XQuery-like queries when what they really want to do is type in a 
> couple of words and have the software present them with the best 
> matches first.  There is no doubt that indexing 250 different fields in 
> Lucene is way extreme and not how it should be used.  But again, this 
> is generally not what folks *really* want.
> 
> For the example that Tim provided, I still biasedly recommend Lucene :)
> 
> In fact, should we have a head-to-head competition to implement 
> different techniques?  We could rate each implementation on 1) How long 
> it took to implement 2) How fast the searches are.   All we need is Tim 
> to write up some unit tests that we can each work on making pass, 
> including some JUnitPerf tests.  I'm game.
> 
>   Erik



Re: [jug-discussion] Searching large object graphs

2004-12-30 Thread Ollie
Rick and Drew have it right. A long time ago in a galaxy far away, I attended 
the Xerox Professional Selling course, and the one important thing I learned 
was that people make the buying decision on three things: (1) the first 
solution they find for a problem, (2) the lowest perceived-risk solution to 
their problem, or (3) the “best” solution to their problem, and in that order. 

The higher the cost, the higher the perceived risk; the newer, the higher the 
perceived risk; etc. 

If you pull a lot of doors and have low prices, you win a lot of business and 
best doesn't matter. 

Ollie

-----Original Message-----
From: "Richard Hightower" <[EMAIL PROTECTED]>
Date: Wed, 29 Dec 2004 20:48:50 
To:
Subject: RE: [jug-discussion] Searching large object graphs

I agree. But what best are you talking about. The best technical solution or
the best business solution. The best business solution is not always the
best technical solution.

(Mounting high horse...) Engineering is about tradeoffs: budget, time,
beer... Actually I just threw beer in there for fun.

I will continue to focus on good enough technical solution to fit the
customers need.

Actually, I will continue to play with technology that I am interested in
and telling the customer it is the best business solution (just kidding). I
will bile all technology I don't understand (if I don't understand it... How
can it be good?) Sorry I was channeling the bile blog :)

BTW Did I mention that OGNL Rocks?!

DREW ROCKS! 

I got to get back to work! Later.

On a lighter note my wireless keyboard and mouse went south I got the
new Microsoft one with all of the bells and whistles. It works, and my
keyboard has 25 extra keys Oh well... It won't make me type faster or
procrastinate any less.



-----Original Message-----
From: Drew Davidson [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, December 29, 2004 10:23 PM
To: jug-discussion@tucson-jug.org
Subject: Re: [jug-discussion] Searching large object graphs

Richard Hightower wrote:

>I agree with Erik. I don't have time to read your long email let alone 
>implement a full-text search engine. I can't think of a single client 
>that would rather have me beat my laptop with a rock, then rent a 
>pneumatic hammer and destroy it in several efficient seconds.
>
>  
>
The best is the enemy of the good. 

Words to live by in contracting.

>On a lighter note I just learned all about DocBook. And More 
>importantly, I've got my wireless signal going all the way to my 
>mobile-mini office.
>
>Belkin Pre-N Wireless Router covers my whole 5 acre lot with a strong 
>signal with a lot of bandwidth. My laptop can pick up a signal on the 
>complete 5 acres with its new Pre-N Wireless NIC. Belkin rocks Linksys stinks.
>  
>
On a related note, Rick is now in the process of growing a second head
because of the increased signal strength.

- Drew

-- 
Drew Davidson | OGNL Technology
  Email: [EMAIL PROTECTED]
    Web: http://www.ognl.org
    Vox: (520) 531-1966
    Fax: (520) 531-1965
 Mobile: (520) 405-2967




Mike Oliver
CTO, Alarius Systems LLC
Las Vegas, Nevada USA

Sent using my BlackBerry 6510 from Nextel


Re: [jug-discussion] Searching large object graphs

2004-12-30 Thread Drew Davidson
Richard Hightower wrote:
> I agree. But which "best" are you talking about: the best technical solution
> or the best business solution? The best business solution is not always the
> best technical solution.
>
> (Mounting high horse...) Engineering is about tradeoffs: budget, time,
> beer... Actually I just threw beer in there for fun.
>
> I will continue to focus on a good-enough technical solution to fit the
> customer's need.
>
> Actually, I will continue to play with technology that I am interested in
> and tell the customer it is the best business solution (just kidding). I
> will bile all technology I don't understand (if I don't understand it... How
> can it be good?) Sorry, I was channeling the bile blog :)

I interpret "best" to mean the most comprehensive, extensible solution 
possible.  "Good" is therefore something that works reasonably well for 
the purpose to which you are putting it.  Simple as possible but no simpler.

- Drew
-- 
Drew Davidson | OGNL Technology
  Email: [EMAIL PROTECTED]
    Web: http://www.ognl.org
    Vox: (520) 531-1966
    Fax: (520) 531-1965
 Mobile: (520) 405-2967


RE: [jug-discussion] Searching large object graphs

2004-12-29 Thread Richard Hightower
"The best is the enemy of the good. "

U... E I just realized that you were agreeing with me. Scratch
almost everything I said 

DOH!

I am lezdexic.


-----Original Message-----
From: Drew Davidson [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, December 29, 2004 10:23 PM
To: jug-discussion@tucson-jug.org
Subject: Re: [jug-discussion] Searching large object graphs

Richard Hightower wrote:

>I agree with Erik. I don't have time to read your long email let alone 
>implement a full-text search engine. I can't think of a single client 
>that would rather have me beat my laptop with a rock, then rent a 
>pneumatic hammer and destroy it in several efficient seconds.
>
>  
>
The best is the enemy of the good. 

Words to live by in contracting.

>On a lighter note I just learned all about DocBook. And More 
>importantly, I've got my wireless signal going all the way to my 
>mobile-mini office.
>
>Belkin Pre-N Wireless Router covers my whole 5 acre lot with a strong 
>signal with a lot of bandwidth. My laptop can pick up a signal on the 
>complete 5 acres with its new Pre-N Wireless NIC. Belkin rocks Linksys stinks.
>  
>
On a related note, Rick is now in the process of growing a second head
because of the increased signal strength.

- Drew

-- 
Drew Davidson | OGNL Technology
  Email: [EMAIL PROTECTED]
    Web: http://www.ognl.org
    Vox: (520) 531-1966
    Fax: (520) 531-1965
 Mobile: (520) 405-2967





RE: [jug-discussion] Searching large object graphs

2004-12-29 Thread Richard Hightower
I agree. But which "best" are you talking about: the best technical solution
or the best business solution? The best business solution is not always the
best technical solution.

(Mounting high horse...) Engineering is about tradeoffs: budget, time,
beer... Actually I just threw beer in there for fun.

I will continue to focus on a good-enough technical solution to fit the
customer's need.

Actually, I will continue to play with technology that I am interested in
and tell the customer it is the best business solution (just kidding). I
will bile all technology I don't understand (if I don't understand it... How
can it be good?) Sorry, I was channeling the bile blog :)

BTW Did I mention that OGNL Rocks?!

DREW ROCKS! 

I got to get back to work! Later.

On a lighter note, my wireless keyboard and mouse went south, so I got the
new Microsoft one with all of the bells and whistles. It works, and my
keyboard has 25 extra keys. Oh well... It won't make me type faster or
procrastinate any less.



-----Original Message-----
From: Drew Davidson [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, December 29, 2004 10:23 PM
To: jug-discussion@tucson-jug.org
Subject: Re: [jug-discussion] Searching large object graphs

Richard Hightower wrote:

>I agree with Erik. I don't have time to read your long email let alone 
>implement a full-text search engine. I can't think of a single client 
>that would rather have me beat my laptop with a rock, then rent a 
>pneumatic hammer and destroy it in several efficient seconds.
>
>  
>
The best is the enemy of the good. 

Words to live by in contracting.

>On a lighter note I just learned all about DocBook. And More 
>importantly, I've got my wireless signal going all the way to my 
>mobile-mini office.
>
>Belkin Pre-N Wireless Router covers my whole 5 acre lot with a strong 
>signal with a lot of bandwidth. My laptop can pick up a signal on the 
>complete 5 acres with its new Pre-N Wireless NIC. Belkin rocks Linksys stinks.
>  
>
On a related note, Rick is now in the process of growing a second head
because of the increased signal strength.

- Drew

-- 
Drew Davidson | OGNL Technology
  Email: [EMAIL PROTECTED]
    Web: http://www.ognl.org
    Vox: (520) 531-1966
    Fax: (520) 531-1965
 Mobile: (520) 405-2967





Re: [jug-discussion] Searching large object graphs

2004-12-29 Thread Drew Davidson
Richard Hightower wrote:

> I agree with Erik. I don't have time to read your long email, let alone
> implement a full-text search engine. I can't think of a single client that
> would rather have me beat my laptop with a rock than rent a pneumatic
> hammer and destroy it in several efficient seconds.

The best is the enemy of the good. 

Words to live by in contracting.

> On a lighter note, I just learned all about DocBook. And more
> importantly, I've got my wireless signal going all the way to my mobile-mini
> office.
>
> Belkin Pre-N Wireless Router covers my whole 5 acre lot with a strong signal
> with a lot of bandwidth. My laptop can pick up a signal on the complete 5
> acres with its new Pre-N Wireless NIC. Belkin rocks; Linksys stinks.

On a related note, Rick is now in the process of growing a second head 
because of the increased signal strength.

- Drew
-- 
Drew Davidson | OGNL Technology
  Email: [EMAIL PROTECTED]
    Web: http://www.ognl.org
    Vox: (520) 531-1966
    Fax: (520) 531-1965
 Mobile: (520) 405-2967


RE: [jug-discussion] Searching large object graphs

2004-12-29 Thread Richard Hightower
Re:
"It seems easier to re-invent a full-text search engine?  I'd be way
impressed if you could beat Lucene!"

I agree with Erik. I don't have time to read your long email, let alone
implement a full-text search engine. I can't think of a single client that
would rather have me beat my laptop with a rock than rent a pneumatic
hammer and destroy it in several efficient seconds.

On a lighter note, I just learned all about DocBook. And more
importantly, I've got my wireless signal going all the way to my mobile-mini
office.

Belkin Pre-N Wireless Router covers my whole 5 acre lot with a strong signal
with a lot of bandwidth. My laptop can pick up a signal on the complete 5
acres with its new Pre-N Wireless NIC. Belkin rocks; Linksys stinks.

I just remembered that Cisco is a client of mine... Hmmm, Linksys is not
as good. I am sure it will be better in the next release. How is that?


-----Original Message-----
From: Erik Hatcher [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, December 29, 2004 8:41 PM
To: jug-discussion@tucson-jug.org
Subject: Re: [jug-discussion] Searching large object graphs


On Dec 29, 2004, at 3:12 PM, [EMAIL PROTECTED] wrote:
> Not to be Trite... But why not just use bean objects to a backend DB.
> Or for that matter hand write the old incremental sort and sorted search
> routines.  If it is all in memory then you should be able to hand write
> an index system capable of running through thousands of records in a
> fraction of a second...  Just seems easier...

It seems easier to re-invent a full-text search engine?  I'd be way
impressed if you could beat Lucene!

Given the example query Tim provided, you'd be able to do this using Lucene
in only a handful of lines of code.

Erik



> On Thu, 23 Dec 2004, Erik Hatcher wrote:
>
>> Lucene
>>
>> The query would be this "name:olson OR email:olson" if you indexed 
>> that information into separate fields.  A common technique is to 
>> index all data you want queryable also into an aggregate field in 
>> which case the query could simply be "olson".
>>
>> The full source code to Lucene in Action is at
>> http://www.manning.com/hatcher2 - the ebook is available.  The 
>> physical book is shipping from the printers as we speak (UPS tracking 
>> says I should have gotten my batch yesterday, but it'll be today it 
>> seems).
>> http://www.lucenebook.com will go live within the week searching
>> *inside* the book as well as a blog system I'm setting up.
>>
>>  Erik
>>
>> On Dec 22, 2004, at 10:27 PM, Tim Colson wrote:
>>
>>> So just assume for a moment that RAM is cheap and you decided to 
>>> load 100K objects into memory. Assume those objects were 
>>> "Employees"... you can imagine the fields would be the usual 
>>> suspects. Assume each employee is associated with a profile that is 
>>> another object, which is composed of a bunch of other data objects.
>>>
>>> What would you use to find/select objects like "Name or email foo 
>>> matches
>>> *olson* " ?
>>>
>>> Some possibilities:
>>> http://jakarta.apache.org/commons/jxpath/
>>>
>>> Some of the stuff inside Commons:
>>> http://jakarta.apache.org/commons/collections/
>>>
>>> Lucene indexes
>>> http://jakarta.apache.org/lucene/docs/
>>>
>>>
>>> Others?
>>>
>>> Tim
>>>





Re: [jug-discussion] Searching large object graphs

2004-12-29 Thread Erik Hatcher
On Dec 29, 2004, at 5:06 PM, [EMAIL PROTECTED] wrote:
> 3) Lucene is a very good system IF you have the kind of loose data it is
> coded for.  However, if you have tight objects, the overhead it spends in
> organizing its search is wasted.  So, if your 100K object is, say, a book
> with a half dozen attributes all containing similar data types, then
> you're fine.  If, however, your 100K object is a development project with
> 250 attributes of mixed data types, then it is not so good.

Structured vs. unstructured searching is a very interesting topic.  
XQuery is well worth consideration here.

I've found in the work I do that folks talk about true structured 
search, but when it comes to designing a search interface it becomes 
vastly more complex for users to comprehend how to formulate XPath or 
XQuery-like queries when what they really want to do is type in a 
couple of words and have the software present them with the best 
matches first.  There is no doubt that indexing 250 different fields in 
Lucene is way extreme and not how it should be used.  But again, this 
is generally not what folks *really* want.

For the example that Tim provided, I still biasedly recommend Lucene :)
In fact, should we have a head-to-head competition to implement 
different techniques?  We could rate each implementation on 1) How long 
it took to implement 2) How fast the searches are.   All we need is Tim 
to write up some unit tests that we can each work on making pass, 
including some JUnitPerf tests.  I'm game.

Erik
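
A rough sketch of what such a head-to-head harness could look like, written in plain Java rather than JUnit/JUnitPerf so that it runs on its own. The Searcher interface, the LinearScan baseline entry, and the generated sample data are all made up for illustration; nobody on the list proposed this exact contract. The timing is deliberately naive (no JIT warm-up), so treat the numbers as a rough comparison only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class SearchContestSketch {
    /** Hypothetical contract that each candidate implementation would satisfy. */
    interface Searcher {
        void load(List<String[]> employees);      // each entry: {name, email}
        List<String> namesMatching(String term);  // names whose name or email contains term
    }

    /** Baseline entry: a plain linear scan over every record. */
    static class LinearScan implements Searcher {
        private List<String[]> data = new ArrayList<>();

        public void load(List<String[]> employees) { data = employees; }

        public List<String> namesMatching(String term) {
            String t = term.toLowerCase();
            return data.stream()
                    .filter(e -> e[0].toLowerCase().contains(t) || e[1].toLowerCase().contains(t))
                    .map(e -> e[0])
                    .collect(Collectors.toList());
        }
    }

    /** Runs the query `runs` times, checks the hit count, returns mean nanoseconds per query. */
    static long timeQuery(Searcher s, String term, int expectedHits, int runs) {
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            if (s.namesMatching(term).size() != expectedHits) {
                throw new AssertionError("wrong result for " + term);
            }
        }
        return (System.nanoTime() - start) / runs;
    }

    /** Generates n fake employees; every 1000th one has the surname Olson. */
    static List<String[]> sampleData(int n) {
        List<String[]> out = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            String last = (i % 1000 == 0) ? "Olson" : "Smith";
            out.add(new String[] {"Emp" + i + " " + last, "emp" + i + "@example.com"});
        }
        return out;
    }

    public static void main(String[] args) {
        Searcher s = new LinearScan();
        s.load(sampleData(100_000));
        System.out.println("mean ns/query: " + timeQuery(s, "olson", 100, 5));
    }
}
```

Any other entry (Lucene-backed, JXPath-backed, hand-rolled index) would implement the same interface and be passed to timeQuery with the same expectations.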


Re: [jug-discussion] Searching large object graphs

2004-12-29 Thread Erik Hatcher
On Dec 29, 2004, at 3:12 PM, [EMAIL PROTECTED] wrote:
> Not to be Trite... But why not just use bean objects to a backend DB.
> Or for that matter hand write the old incremental sort and sorted search
> routines.  If it is all in memory then you should be able to hand write
> an index system capable of running through thousands of records in a
> fraction of a second...  Just seems easier...

It seems easier to re-invent a full-text search engine?  I'd be way 
impressed if you could beat Lucene!

Given the example query Tim provided, you'd be able to do this using 
Lucene in only a handful of lines of code.

Erik



Re: [jug-discussion] Searching large object graphs

2004-12-29 Thread Bryan . ONeal
Yeah, I need to write my own metadata to the files and then retrieve it to
perform searches.  Thanks for the info. I will check out your source and the
Geocities site this weekend and go from there.  If I figure something out I
will post the answer :)

Yeah! Searchable image archives for everyone!






Re: [jug-discussion] Searching large object graphs

2004-12-29 Thread Drew Davidson
[EMAIL PROTECTED] wrote:
> While we are on the subject, I am looking for a more standard way of
> incorporating metadata into a JPG.  (Currently I do a preparatory insert
> in the JPG code and do a search on the whole code using tricks similar to
> those in Lucene; however, this is hardly ideal.)
> Any JPEG people out there?

JPEG has EXIF metadata for storing boatloads of information about the 
image; I believe (not exactly sure, though) that you can put your own 
custom information in there.  Is that what you are asking?

Java library for extraction of standard metadata:
   http://www.drewnoakes.com/code/exif/
As far as writing the JPEG out with metadata, I don't have any resources 
right off the top of my head.
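
For orientation, a JPEG file is a sequence of marker segments: EXIF data sits in an APP1 segment whose payload starts with "Exif" followed by two zero bytes, and free-text comments sit in COM segments. The following is a minimal sketch in plain Java, not the library linked above, that walks the segments before the image scan data and reports what it finds. The class and method names are made up for illustration, and the sketch assumes a well-formed stream with no fill bytes between segments.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class JpegSegmentSketch {
    // Second byte of each two-byte marker (the first byte is always 0xFF).
    static final int SOI = 0xD8, SOS = 0xDA, APP1 = 0xE1, COM = 0xFE;

    /** Lists COM comments and Exif APP1 segments that appear before the scan data. */
    static List<String> describe(byte[] jpeg) {
        if (jpeg.length < 2 || (jpeg[0] & 0xFF) != 0xFF || (jpeg[1] & 0xFF) != SOI) {
            throw new IllegalArgumentException("not a JPEG stream (missing SOI marker)");
        }
        List<String> found = new ArrayList<>();
        int pos = 2;
        while (pos + 4 <= jpeg.length && (jpeg[pos] & 0xFF) == 0xFF) {
            int marker = jpeg[pos + 1] & 0xFF;
            if (marker == SOS) break; // compressed image data follows; stop here
            // Big-endian segment length; it counts its own two bytes.
            int len = ((jpeg[pos + 2] & 0xFF) << 8) | (jpeg[pos + 3] & 0xFF);
            int dataStart = pos + 4;
            int dataLen = len - 2;
            if (dataLen < 0 || dataStart + dataLen > jpeg.length) break; // truncated or corrupt
            if (marker == COM) {
                found.add("COM: " + new String(jpeg, dataStart, dataLen, StandardCharsets.ISO_8859_1));
            } else if (marker == APP1 && dataLen >= 6
                    && new String(jpeg, dataStart, 6, StandardCharsets.ISO_8859_1).equals("Exif\0\0")) {
                found.add("APP1: Exif, " + (dataLen - 6) + " bytes");
            }
            pos = dataStart + dataLen;
        }
        return found;
    }
}
```

Decoding the EXIF payload itself (a TIFF-style directory of tags) is the part a metadata library handles; this sketch only locates the segments.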

A good reference for stuff about JPEG:
   http://www.geocities.com/marcoschmidt.geo/jpeg-image-file-format.html
And, of course an excellent reference to everything you ever wanted to 
know about JPEG, EXIF metadata and the like:

   http://www.google.com
:-)
- Drew
--
+-+
< Drew Davidson | OGNL Technology >
+-+
|  Email: [EMAIL PROTECTED]  /
|Web: http://www.ognl.org   /
|Vox: (520) 531-1966   <
|Fax: (520) 531-1965\
| Mobile: (520) 405-2967 \
+-+


RE: [jug-discussion] Searching large object graphs

2004-12-29 Thread Bryan . ONeal
Hmmm, OK, I like to pose hypothetical questions as well, so no big deal. As for
"waste of time re-inventing the wheel!": you do realize the wheel has been
reinvented many, many times to suit the ever-changing needs of the vehicles
that use it, right? :)  If you don't believe me, do a patent search ;)

It is true my short answer did not provide enough detail to interest you, so I
will attempt to explain more fully.
1) Multi-object search criteria are hardly time consuming to write.  Yes, even
fully modular ones that allow multiple-object searches.  If your code exceeds
a few thousand lines in total, you're not letting Java do the work.  Just
pull the metadata from your object and construct the search, the same as you
would if you were trying to create an SQL query.  I personally like to
construct open vectors of anonymous objects and do simple condition checking
to employ a loose syntax.  That way you can use wildcards not only in the
actual data search, but also on the portions of the objects being searched.
This is a nice approach because the calling function needs to know only that
something like what it wants exists somewhere in the object.  But it also
allows you to stream past most of the condition checking if you know more
about the object.  And it can all be implemented in a black-box format so the
next code down the line does not have to understand the "Wheel" in order to
use it.
2) If the issue is one of a few big objects requiring a fast search, then you
actually lose efficiency with many forms of indexing, and B-Tree optimization
of the objects and corresponding indexes is a huge waste of overhead.  In
that case I would again use one of a variety of simple sorted searches.  Which
ones and how to implement them of course depends on how many objects you are
looking at and how often the data changes (do they still teach this stuff in
programming 101?).  I have actually written a number of good subsystems for
use in regional nodes, like those used in very large state/national
production-oriented DBs.  (If you're wondering why, it was because bandwidth
is still an issue and each region typically uses on the order of 95%
regional-level data, but everyone still needs to be able to access everything
at all times without knowing where it originated.)
3) Lucene is a very good system IF you have the kind of loose data it is coded
for.  However, if you have tight objects, the overhead it spends in organizing
its search is wasted.  So, if your 100K object is, say, a book with a half
dozen attributes all containing similar data types, then you're fine.  If,
however, your 100K object is a development project with 250 attributes of
mixed data types, then it is not so good.  This is why I suggested hand coding
for your intended purpose.  This way such pertinent questions as overall
object structure, relation dependencies, data types, size of each attribute,
etc., can be taken into account.  BTW I do this a lot as a freelance DB
consultant.
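
One possible reading of point 1, sketched with Java reflection: wildcards apply both to the attribute names being searched and to the values. This is only a guess at the general shape, not Bryan's actual code; the Project class, the glob helper, and the method names are invented for the example, and the sketch looks only at an object's own declared fields (it does not descend into contained objects).

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class ReflectiveSearchSketch {
    /** Hypothetical sample type; any application class with fields would do. */
    static class Project {
        String title;
        String ownerEmail;
        int budget;
        Project(String title, String ownerEmail, int budget) {
            this.title = title; this.ownerEmail = ownerEmail; this.budget = budget;
        }
    }

    /** Turns a glob such as "*olson*" into a case-insensitive regex. */
    static Pattern glob(String glob) {
        String regex = Arrays.stream(glob.split("\\*", -1))
                .map(Pattern::quote)
                .collect(Collectors.joining(".*"));
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    }

    /** True if any field whose NAME matches fieldGlob holds a VALUE matching valueGlob. */
    static boolean matches(Object target, Pattern fieldGlob, Pattern valueGlob) {
        for (Field f : target.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) continue;
            if (!fieldGlob.matcher(f.getName()).matches()) continue;
            try {
                f.setAccessible(true); // intended for your own application classes
                Object value = f.get(target);
                if (value != null && valueGlob.matcher(value.toString()).matches()) return true;
            } catch (IllegalAccessException e) {
                // unreadable field: treat as a non-match
            }
        }
        return false;
    }

    static <T> List<T> search(Collection<T> items, String fieldGlob, String valueGlob) {
        Pattern fp = glob(fieldGlob), vp = glob(valueGlob);
        List<T> hits = new ArrayList<>();
        for (T item : items) {
            if (matches(item, fp, vp)) hits.add(item);
        }
        return hits;
    }
}
```

With this, search(projects, "*mail*", "*olson*") needs to know only that some attribute with "mail" in its name exists, while search(projects, "title", "*olson*") checks a single known attribute.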

While we are on the subject, I am looking for a more standard way of
incorporating metadata into a JPG.  (Currently I do a preparatory insert in
the JPG code and do a search on the whole code using tricks similar to those
in Lucene; however, this is hardly ideal.)
Any JPEG people out there?



RE: [jug-discussion] Searching large object graphs

2004-12-29 Thread Michael Oliver
Or "Trust me..."

Michael Oliver
CTO
Alarius Systems LLC
3325 N. Nellis Blvd, #1
Las Vegas, NV 89115
Phone:(702)643-7425
Fax:(520)844-1036
*Note new email changed from [EMAIL PROTECTED]

-Original Message-
From: Richard Hightower [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, December 29, 2004 11:28 AM
To: jug-discussion@tucson-jug.org
Subject: RE: [jug-discussion] Searching large object graphs

Beware of any email that begins with the words "Not to be Trite...". You
can feel a big wall of Trite flame coming around the corner. :)




RE: [jug-discussion] Searching large object graphs

2004-12-29 Thread Richard Hightower
Beware of any email that begins with the words "Not to be Trite...". You can
feel a big wall of Trite flame coming around the corner. :)





RE: [jug-discussion] Searching large object graphs

2004-12-29 Thread Michael Oliver
Oh my, I must have heard differently, I knew you were challenged and it
had something to do with small, but I was way off base...;-)

Michael Oliver
CTO
Alarius Systems LLC
3325 N. Nellis Blvd, #1
Las Vegas, NV 89115
Phone:(702)643-7425
Fax:(520)844-1036
*Note new email changed from [EMAIL PROTECTED]




RE: [jug-discussion] Searching large object graphs

2004-12-29 Thread Tim Colson
>But why not just use bean objects to a backend DB.  

Well, how about because I explicitly posed the question as "just assume for a
moment that RAM is cheap and you decided to load 100K objects into memory"
instead of "I have a lot of data...what kind of thingy should I store it
in... oh, and please reply with small words because I am developmentally
challenged."

Maybe that's why. ;-)

> that matter hand write the old incremental sort and sorted search
> routines.  

Apparently I wasn't clear -- I want to search using multiple criteria with
wildcards/booleans on multiple fields, and on data in contained objects. 

Mostly I'd rather not waste time re-inventing wheels, and [usually] the
folks on the list provide interesting food for thought. 

I won't bother with the flame-bait about "overly complex" and airguns.

Cheers,
Tim
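
The kind of search described in this message (multiple criteria, wildcards, booleans, and data in contained objects) can be sketched with composable predicates over the in-memory collection. This is a plain linear scan, not one of the libraries under discussion; the Employee and Profile shapes, the skills list, and the helper names are assumptions made up for the example.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.function.Predicate;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class GraphSearchSketch {
    /** Hypothetical contained object hanging off each employee. */
    static class Profile {
        final List<String> skills;
        Profile(List<String> skills) { this.skills = skills; }
    }

    static class Employee {
        final String name;
        final String email;
        final Profile profile;
        Employee(String name, String email, Profile profile) {
            this.name = name; this.email = email; this.profile = profile;
        }
    }

    /** Case-insensitive wildcard match, where '*' stands for any run of characters. */
    static Predicate<String> glob(String glob) {
        String regex = Arrays.stream(glob.split("\\*", -1))
                .map(Pattern::quote)
                .collect(Collectors.joining(".*"));
        Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
        return s -> s != null && p.matcher(s).matches();
    }

    // Field selectors: lift a string matcher to a predicate on the whole employee.
    static Predicate<Employee> name(Predicate<String> m)  { return e -> m.test(e.name); }
    static Predicate<Employee> email(Predicate<String> m) { return e -> m.test(e.email); }

    /** Reaches into the contained Profile object. */
    static Predicate<Employee> anySkill(Predicate<String> m) {
        return e -> e.profile != null && e.profile.skills.stream().anyMatch(m);
    }

    static List<Employee> find(Collection<Employee> all, Predicate<Employee> query) {
        return all.stream().filter(query).collect(Collectors.toList());
    }
}
```

A query such as "name or email matches *olson*, and some skill is java" then reads name(glob("*olson*")).or(email(glob("*olson*"))).and(anySkill(glob("java"))). Every query touches every object, so whether this is fast enough for 100K employees is exactly the sort of thing a timing test would have to settle.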





RE: [jug-discussion] Searching large object graphs

2004-12-29 Thread Richard Hightower
I keep hitting my thumb with the Rock. I guess that is better than severing
my limb with the pneumatic hammer. 

Congrats on the book Erik. Lucene seems really cool. I hope to work with it
on a future project. My limbs seem to grow back.




Re: [jug-discussion] Searching large object graphs

2004-12-29 Thread Bryan . ONeal
Not to be Trite... But why not just use bean objects to a backend DB.  Or for
that matter hand write the old incremental sort and sorted search
routines.  If it is all in memory then you should be able to hand write an
index system capable of running through thousands of records in a fraction of
a second...  Just seems easier...  but then again I am not a CSE, so I don't
get a lot of joy out of using the overly complex to do the overly simple just
so I can learn about the overly complex for no more reason than I may need it
later.  Or more simply, for me it is easier to hammer one loose nail with a
nearby rock than to set up a pneumatic nail gun.
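
A minimal sketch of the hand-written approach mentioned above, assuming "incremental sort" means inserting each key at its sorted position and "sorted search" means a binary search over the result. The class and method names are made up. Note the limitation: a sorted list answers exact and prefix lookups quickly, but a substring pattern such as *olson* would still need a full scan.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SortedNameIndexSketch {
    private final List<String> keys = new ArrayList<>(); // lower-cased, always kept sorted

    /** Incremental sort: insert each key at its sorted position. */
    void add(String key) {
        String k = key.toLowerCase();
        int pos = Collections.binarySearch(keys, k);
        if (pos < 0) pos = -pos - 1; // "not found" result encodes the insertion point
        keys.add(pos, k);
    }

    /** All keys starting with prefix: binary search to the start, then a short forward scan. */
    List<String> withPrefix(String prefix) {
        String p = prefix.toLowerCase();
        int start = Collections.binarySearch(keys, p);
        if (start < 0) {
            start = -start - 1;
        } else {
            // binarySearch may land on any of several equal keys; step back to the first
            while (start > 0 && keys.get(start - 1).equals(p)) start--;
        }
        List<String> out = new ArrayList<>();
        for (int i = start; i < keys.size() && keys.get(i).startsWith(p); i++) {
            out.add(keys.get(i));
        }
        return out;
    }
}
```

Keys that share a prefix are contiguous in sorted order, which is why the forward scan can stop at the first key that does not match.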




[jug-discussion] JavaOne CALL FOR PAPERS, was RE: [jug-discussion] Searching large object graphs

2004-12-23 Thread Richard Hightower
Hey Erik et al. 

I am glad to hear your Lucene in Action book is going to the printers. I
will order a copy ASAP.

BTW JavaOne 2005 is doing a call for papers. I was thinking about signing
up. You should think about it too. (The year I got accepted, I submitted 5
presentations, and they chose one b/c someone called in sick. They called me
last minute. I spoke on XDoclet making EJB CMP/CMR easier. Shudder...
Brrr...)

I plan on being in town (Tucson) for the next six weeks or so (plans subject
to change). I am writing some articles for IBM and starting a book for
O'Reilly in my downtime (Drew and I are working on it together).

Sorry I missed you in VA. I wanted to get together the last week, but my
schedule got crazy. 

When are you coming to Tucson?

I better get to work. There is no persecution like staring at a blank page.

BTW, are there any Eclipse plugin/SWT experts in Tucson who would not mind
traveling a bit to LA?

-----Original Message-----
From: Erik Hatcher [mailto:[EMAIL PROTECTED] 
Sent: Thursday, December 23, 2004 3:05 AM
To: jug-discussion@tucson-jug.org
Subject: Re: [jug-discussion] Searching large object graphs

Lucene

The query would be this "name:olson OR email:olson" if you indexed that
information into separate fields.  A common technique is to index all data
you want queryable also into an aggregate field in which case the query
could simply be "olson".

The full source code to Lucene in Action is at
http://www.manning.com/hatcher2 - the ebook is available.  The physical book
is shipping from the printers as we speak (UPS tracking says I should have
gotten my batch yesterday, but it'll be today it seems).  
http://www.lucenebook.com will go live within the week searching
*inside* the book as well as a blog system I'm setting up.

Erik

On Dec 22, 2004, at 10:27 PM, Tim Colson wrote:

> So just assume for a moment that RAM is cheap and you decided to load 
> 100K objects into memory. Assume those objects were "Employees"... you 
> can imagine the fields would be the usual suspects. Assume each 
> employee is associated with a profile that is another object, which is 
> composed of a bunch of other data objects.
>
> What would you use to find/select objects like "Name or email foo
> matches *olson* " ?
>
> Some possibilities:
> http://jakarta.apache.org/commons/jxpath/
>
> Some of the stuff inside Commons:
> http://jakarta.apache.org/commons/collections/
>
> Lucene indexes
> http://jakarta.apache.org/lucene/docs/
>
>
> Others?
>
> Tim
>



Re: [jug-discussion] Searching large object graphs

2004-12-23 Thread Erik Hatcher
Lucene

The query would be "name:olson OR email:olson" if you indexed that 
information into separate fields.  A common technique is to also index 
all the data you want queryable into an aggregate field, in which case 
the query could simply be "olson".
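
As a rough illustration of the two indexing styles mentioned above (separate fields versus one aggregate field), here is a minimal sketch in plain Java. It is not Lucene and does not use Lucene's API; the class FieldIndexSketch, its add and find methods, the aggregate field name "all", and the sample data are all hypothetical. It matches whole lowercase tokens only, so "olson" would not match "Colson"; a substring pattern like *olson* would need wildcard handling that this sketch leaves out.

```java
import java.util.*;

// Hypothetical illustration only: a tiny per-field inverted index in plain Java.
class FieldIndexSketch {
    // field name -> (term -> ids of the objects whose field contains that term)
    private final Map<String, Map<String, Set<Integer>>> index = new HashMap<>();

    // Tokenize the value, then index each term under its own field
    // and again under the aggregate "all" field.
    public void add(int id, String field, String value) {
        for (String term : value.toLowerCase().split("[^a-z0-9]+")) {
            if (term.isEmpty()) continue; // split can yield a leading empty token
            post(field, term, id);
            post("all", term, id);
        }
    }

    private void post(String field, String term, int id) {
        index.computeIfAbsent(field, f -> new HashMap<>())
             .computeIfAbsent(term, t -> new TreeSet<>())
             .add(id);
    }

    // Exact term lookup within one field; returns an empty set when nothing matches.
    public Set<Integer> find(String field, String term) {
        Map<String, Set<Integer>> terms =
            index.getOrDefault(field, Collections.emptyMap());
        return Collections.unmodifiableSet(
            terms.getOrDefault(term.toLowerCase(), Collections.emptySet()));
    }

    public static void main(String[] args) {
        FieldIndexSketch idx = new FieldIndexSketch();
        idx.add(1, "name", "Tim Olson");
        idx.add(1, "email", "tim@example.com");
        idx.add(2, "name", "Pat Smith");
        idx.add(2, "email", "p.olson@example.com");
        idx.add(3, "name", "Ann Lee");
        idx.add(3, "email", "ann@example.com");

        // Equivalent in spirit to "name:olson OR email:olson": union of two field lookups.
        Set<Integer> either = new TreeSet<>(idx.find("name", "olson"));
        either.addAll(idx.find("email", "olson"));
        System.out.println("name OR email: " + either);

        // Equivalent in spirit to the bare query "olson" against the aggregate field.
        System.out.println("aggregate:     " + idx.find("all", "olson"));
    }
}
```

The aggregate field costs extra memory, since every term is stored twice, but it lets a caller search without knowing which field holds the match.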

The full source code to Lucene in Action is at 
http://www.manning.com/hatcher2 - the ebook is available.  The physical 
book is shipping from the printers as we speak (UPS tracking says I 
should have gotten my batch yesterday, but it'll be today it seems).  
http://www.lucenebook.com will go live within the week searching 
*inside* the book as well as a blog system I'm setting up.

Erik

On Dec 22, 2004, at 10:27 PM, Tim Colson wrote:

> So just assume for a moment that RAM is cheap and you decided to load
> 100K objects into memory. Assume those objects were "Employees"... you
> can imagine the fields would be the usual suspects. Assume each
> employee is associated with a profile that is another object, which is
> composed of a bunch of other data objects.
>
> What would you use to find/select objects like "Name or email foo
> matches *olson* " ?
>
> Some possibilities:
> http://jakarta.apache.org/commons/jxpath/
>
> Some of the stuff inside Commons:
> http://jakarta.apache.org/commons/collections/
>
> Lucene indexes
> http://jakarta.apache.org/lucene/docs/
>
> Others?
>
> Tim


Re: [jug-discussion] Searching large object graphs

2004-12-22 Thread Lesiecki Nicholas
OGNL

Nick
--- Tim Colson <[EMAIL PROTECTED]> wrote:

> So just assume for a moment that RAM is cheap and you decided to load 100K
> objects into memory. Assume those objects were "Employees"... you can
> imagine the fields would be the usual suspects. Assume each employee is
> associated with a profile that is another object, which is composed of a
> bunch of other data objects.
> 
> What would you use to find/select objects like "Name or email foo matches
> *olson* " ? 
> 
> Some possibilities:
> http://jakarta.apache.org/commons/jxpath/
> 
> Some of the stuff inside Commons:
> http://jakarta.apache.org/commons/collections/
> 
> Lucene indexes
> http://jakarta.apache.org/lucene/docs/
> 
> 
> Others?
> 
> Tim
> 