Re: ant example, tika

2008-12-16 Thread Chris Hostetter

: I think, eventually, and I really hate to say this b/c classloading is a
: nightmare, but we may want to look into isolated classloaders or OSGi or
: something for the Solr Home lib directory.  The benefits being that I already
: see library collisions in our future.

we already have classloaders isolated by SolrCore.  multiple plugins used 
by the same core with conflicting dependencies might cause library 
collision, but i don't see how that would be much different using any 
other approach.  if we switch to a config management framework that 
already has plugin management as a feature then by all means let's use 
it, but i don't think we need to go looking for a change -- we've already 
done the hard work.


-Hoss
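
To make the per-core isolation idea concrete, here is a minimal sketch of one classloader per core built over that core's lib directory. It is not Solr's actual loader: the class name CoreLibLoaders, the method forLibDir, and the core0/lib and core1/lib paths are all hypothetical, and it only assumes the standard java.net.URLClassLoader.

```java
import java.io.File;
import java.io.UncheckedIOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not Solr's resource loader.
final class CoreLibLoaders {
    private CoreLibLoaders() {}

    /** Builds a classloader over every *.jar in libDir, delegating to the shared parent first. */
    static URLClassLoader forLibDir(File libDir, ClassLoader parent) {
        List<URL> urls = new ArrayList<>();
        File[] jars = libDir.listFiles((dir, name) -> name.endsWith(".jar"));
        if (jars != null) {          // listFiles returns null when libDir does not exist
            Arrays.sort(jars);       // deterministic search order within one core
            for (File jar : jars) {
                try {
                    urls.add(jar.toURI().toURL());
                } catch (MalformedURLException e) {
                    throw new UncheckedIOException(e);
                }
            }
        }
        return new URLClassLoader(urls.toArray(new URL[0]), parent);
    }

    public static void main(String[] args) throws Exception {
        ClassLoader shared = CoreLibLoaders.class.getClassLoader();
        // Each core gets its own loader, so jars in core0/lib are invisible to core1.
        try (URLClassLoader core0 = forLibDir(new File("core0/lib"), shared);
             URLClassLoader core1 = forLibDir(new File("core1/lib"), shared)) {
            System.out.println("core0 jars: " + core0.getURLs().length);
            System.out.println("core1 jars: " + core1.getURLs().length);
        }
    }
}
```

Two cores can then carry different versions of the same library without seeing each other's copy. As noted above, this does nothing for two plugins inside the same core that disagree about a dependency, since they share one loader.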



Re: ant example, tika

2008-12-15 Thread Otis Gospodnetic
For what it's worth, I'm pro keeping contrib things in their own jars.  It's 
easy to figure out what you need to include when/if you need it, but it pains 
me to carry around jars with code that I have absolutely no need for.


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
 From: Chris Hostetter hossman_luc...@fucit.org
 To: solr-dev@lucene.apache.org
 Sent: Saturday, December 13, 2008 2:12:01 AM
 Subject: Re: ant example, tika
 
 
 : The only issue I see now is that DIH has been released as part of the core, so
 : I would vote that it stays in there.  It is also quite popular, I think, so
 : I'd hate to break people.
 
 ...which is why having a kitchen-sink war with all the contribs might make 
 sense.  But frankly i don't see it as very problematic to document how 
 to use a DIH jar for people who upgrade ... we have to document how to use 
 contribs in general.
 
 
 
 -Hoss



Re: ant example, tika

2008-12-15 Thread Grant Ingersoll

Just to add some more fun to the mix:
I think, eventually, and I really hate to say this b/c classloading is  
a nightmare, but we may want to look into isolated classloaders or  
OSGi or something for the Solr Home lib directory.  The benefits being  
that I already see library collisions in our future.
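
One way to see the collision risk before it bites is to scan a lib directory for class files that appear in more than one jar. The following is only a sketch under assumptions: LibCollisionCheck, collisions, and readLibDir are hypothetical names, nothing here is part of Solr, and it relies only on java.util.jar.JarFile.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical diagnostic for illustration only; not part of Solr.
final class LibCollisionCheck {
    private LibCollisionCheck() {}

    /** Given jar name -> entry names, returns each .class entry found in two or more jars. */
    static Map<String, List<String>> collisions(Map<String, List<String>> entriesByJar) {
        Map<String, List<String>> owners = new TreeMap<>();
        for (Map.Entry<String, List<String>> jar : entriesByJar.entrySet()) {
            for (String entry : jar.getValue()) {
                if (!entry.endsWith(".class")) {
                    continue;                       // ignore resources and manifests
                }
                List<String> jars = owners.computeIfAbsent(entry, k -> new ArrayList<>());
                if (!jars.contains(jar.getKey())) { // count each jar once per class
                    jars.add(jar.getKey());
                }
            }
        }
        owners.values().removeIf(jars -> jars.size() < 2);
        return owners;
    }

    /** Reads the entry names of every *.jar directly inside libDir. */
    static Map<String, List<String>> readLibDir(File libDir) {
        Map<String, List<String>> entriesByJar = new LinkedHashMap<>();
        File[] jars = libDir.listFiles((dir, name) -> name.endsWith(".jar"));
        if (jars == null) {
            return entriesByJar;                    // directory missing or unreadable
        }
        Arrays.sort(jars);
        for (File jar : jars) {
            try (JarFile jf = new JarFile(jar)) {
                List<String> names = new ArrayList<>();
                for (JarEntry e : Collections.list(jf.entries())) {
                    names.add(e.getName());
                }
                entriesByJar.put(jar.getName(), names);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return entriesByJar;
    }
}
```

Calling collisions(readLibDir(new File("lib"))) on a Solr Home lib directory would list every class shipped by more than one jar. It only catches identical class names, not incompatible versions of a library split across differently named classes.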



On Dec 15, 2008, at 12:50 PM, Otis Gospodnetic wrote:

For what it's worth, I'm pro keeping contrib things in their own  
jars.  It's easy to figure out what you need to include when/if you  
need it, but it pains me to carry around jars with code that I have  
absolutely no need for.



Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 

From: Chris Hostetter hossman_luc...@fucit.org
To: solr-dev@lucene.apache.org
Sent: Saturday, December 13, 2008 2:12:01 AM
Subject: Re: ant example, tika


: The only issue I see now is that DIH has been released as part of the core, so
: I would vote that it stays in there.  It is also quite popular, I think, so
: I'd hate to break people.

...which is why having a kitchen-sink war with all the contribs might make
sense.  But frankly i don't see it as very problematic to document how
to use a DIH jar for people who upgrade ... we have to document how to use
contribs in general.



-Hoss




--
Grant Ingersoll

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ












Re: ant example, tika

2008-12-12 Thread Grant Ingersoll


On Dec 11, 2008, at 10:50 PM, Chris Hostetter wrote:



: Ignoring the JSP dilemma... DIH's JAR doesn't need to be in the WAR, but can
: ship in a lib/ directory outside the WAR and come in as a plugin.  And Solr
: can ship with all of the contribs wired in to a kitchen-sink example
: configuration.
:
: There is merit to keeping Solr's WAR and core to the most minimal size
: possible and leveraging the plugin capability to let users reduce the
: footprint and un-used parts.

+1 ... there really shouldn't be any contribs in the war.  If we're
worried that asking people to put the DIH jar in the plugin directory is
too complicated for new users to understand (and i really can't believe
that: if someone can understand how to write a data-config.xml then copying
a jar file should be trivial) we can make a solr-kitchen-sink.war that
contains *every* contrib and *every* dependency in addition to the regular
one.

But even that seems less useful in general than having a more robust set
of examples -- where each one gets a lib directory populated with just the
plugins it's demonstrating (and possibly a kitchen-sink example showing
off all of them)

Honestly: I didn't even realize DIH was adding itself to the war until
recently, but then again i've been a little out of touch.




The only issue I see now is that DIH has been released as part of the  
core, so I would vote that it stays in there.  It is also quite  
popular, I think, so I'd hate to break people.


Re: ant example, tika

2008-12-12 Thread Grant Ingersoll
It occurred to me that we could also add a core-example target that  
only builds the core example for those impatient types w/ slow  
machines ;-)



On Dec 12, 2008, at 8:03 AM, Grant Ingersoll wrote:



On Dec 11, 2008, at 10:50 PM, Chris Hostetter wrote:



: Ignoring the JSP dilemma... DIH's JAR doesn't need to be in the WAR, but can
: ship in a lib/ directory outside the WAR and come in as a plugin.  And Solr
: can ship with all of the contribs wired in to a kitchen-sink example
: configuration.
:
: There is merit to keeping Solr's WAR and core to the most minimal size
: possible and leveraging the plugin capability to let users reduce the
: footprint and un-used parts.

+1 ... there really shouldn't be any contribs in the war.  If we're
worried that asking people to put the DIH jar in the plugin directory is
too complicated for new users to understand (and i really can't believe
that: if someone can understand how to write a data-config.xml then copying
a jar file should be trivial) we can make a solr-kitchen-sink.war that
contains *every* contrib and *every* dependency in addition to the regular
one.

But even that seems less useful in general than having a more robust set
of examples -- where each one gets a lib directory populated with just the
plugins it's demonstrating (and possibly a kitchen-sink example showing
off all of them)

Honestly: I didn't even realize DIH was adding itself to the war until
recently, but then again i've been a little out of touch.




The only issue I see now is that DIH has been released as part of  
the core, so I would vote that it stays in there.  It is also quite  
popular, I think, so I'd hate to break people.





Re: ant example, tika

2008-12-12 Thread Chris Hostetter

: The only issue I see now is that DIH has been released as part of the core, so
: I would vote that it stays in there.  It is also quite popular, I think, so
: I'd hate to break people.

...which is why having a kitchen-sink war with all the contribs might make 
sense.  But frankly i don't see it as very problematic to document how 
to use a DIH jar for people who upgrade ... we have to document how to use 
contribs in general.



-Hoss



Re: ant example, tika

2008-12-08 Thread Noble Paul നോബിള്‍ नोब्ळ्
If it is not a part of the war, first-time users will have trouble
'getting started'.
It is already used by most first-time users. So do we want to
change the expectation?

On Mon, Dec 8, 2008 at 1:13 PM, Erik Hatcher [EMAIL PROTECTED] wrote:

 On Dec 8, 2008, at 2:33 AM, Shalin Shekhar Mangar wrote:

 On Mon, Dec 8, 2008 at 1:02 AM, Ryan McKinley [EMAIL PROTECTED] wrote:


 Also, it looks like DataImportHandler puts itself in the war file -- I
 don't think we want that either.


 One big difference between DataImportHandler and other contribs is that it
 has zero extra dependencies and it contains a JSP as well. Plus, it is
 becoming very popular. I'd like to keep it in the war by default.

 Without the JSP it doesn't need to live in the WAR.  But then again, if we
 remove the JSPs by my proposal, we end up with VelocityResponseWriter in the
 WAR :)  [and VrW does have additional dependencies, and then DIH would
 depend on VrW - LOL]

Erik





-- 
--Noble Paul


Re: ant example, tika

2008-12-08 Thread Erik Hatcher

Regarding DIH:

On Dec 8, 2008, at 3:00 AM, Noble Paul നോബിള്‍ नोब्ळ् wrote:

If it is not a part of the war, first-time users will have trouble
'getting started'.
It is already used by most first-time users. So do we want to
change the expectation?


Ignoring the JSP dilemma... DIH's JAR doesn't need to be in the WAR,  
but can ship in a lib/ directory outside the WAR and come in as a  
plugin.  And Solr can ship with all of the contribs wired in to a  
kitchen-sink example configuration.


There is merit to keeping Solr's WAR and core to the most minimal size  
possible and leveraging the plugin capability to let users reduce the  
footprint and un-used parts.


I'm just sayin'

Erik



Re: ant example, tika

2008-12-07 Thread Ryan McKinley


On Dec 7, 2008, at 11:42 AM, Yonik Seeley wrote:


I notice that SOLR-284 (extraction data handler + tika) is in the
default example now.
Is this what we want?


I don't think so.

Also, it looks like DataImportHandler puts itself in the war file -- I  
don't think we want that either.





On the topic of example, it used to be much faster in the past to
use for debugging... now when I do ant example, I go get a cup of
coffee.



Agreed -- I would hope that ant run-example would compile and run  
the minimum setup required.  We can have a different setup that would  
do more things:  dataimporthandler, velocity, and extraction.


ryan


Re: ant example, tika

2008-12-07 Thread Grant Ingersoll
Tika shouldn't be in the example.  It just puts the libs there but is  
not hooked into the config.



On Dec 7, 2008, at 2:32 PM, Ryan McKinley wrote:



On Dec 7, 2008, at 11:42 AM, Yonik Seeley wrote:


I notice that SOLR-284 (extraction data handler + tika) is in the
default example now.
Is this what we want?


I don't think so.

Also, it looks like DataImportHandler puts itself in the war file --  
I don't think we want that either.





On the topic of example, it used to be much faster in the past to
use for debugging... now when I do ant example, I go get a cup of
coffee.



Agreed -- I would hope that ant run-example would compile and run  
the minimum setup required.  We can have a different setup that  
would do more things:  dataimporthandler, velocity, and extraction.


ryan





Re: ant example, tika

2008-12-07 Thread Erik Hatcher


On Dec 7, 2008, at 2:32 PM, Ryan McKinley wrote:

On Dec 7, 2008, at 11:42 AM, Yonik Seeley wrote:


I notice that SOLR-284 (extraction data handler + tika) is in the
default example now.
Is this what we want?


I don't think so.

Also, it looks like DataImportHandler puts itself in the war file --  
I don't think we want that either.


DIH has a JSP that needs to be in the WAR, unfortunately.

Interestingly, the JSP could be done away with and the  
VelocityResponseWriter could be used, with templates externalized (in  
the DIH JAR file, for example) and plugged in as a true lib/ plugin.


Erik



Re: ant example, tika

2008-12-07 Thread Ryan McKinley


On Dec 7, 2008, at 3:04 PM, Erik Hatcher wrote:



On Dec 7, 2008, at 2:32 PM, Ryan McKinley wrote:

On Dec 7, 2008, at 11:42 AM, Yonik Seeley wrote:


I notice that SOLR-284 (extraction data handler + tika) is in the
default example now.
Is this what we want?


I don't think so.

Also, it looks like DataImportHandler puts itself in the war file  
-- I don't think we want that either.


DIH has a JSP that needs to be in the WAR, unfortunately.

Interestingly, the JSP could be done away with and the  
VelocityResponseWriter could be used, with templates externalized  
(in the DIH JAR file, for example) and plugged in as a true lib/  
plugin.




yes, it would be nice to get rid of all .jsp and use some other system  
(i don't care what, but velocity seems like a good choice).


This would let all the admin goodness be available from embedded  
systems as well as the standard stack.  Also it would get rid of the  
untestable stuff.


However, this may be on the 2.0 list :(

ryan


Re: ant example, tika

2008-12-07 Thread Yonik Seeley
On Sun, Dec 7, 2008 at 2:47 PM, Grant Ingersoll [EMAIL PROTECTED] wrote:
 Tika shouldn't be in the example.  It just puts the libs there but is not
 hooked into the config.

Hmmm, so it's half installed.  Seems like we should either add in the
extraction handler (perhaps lazily loaded if it takes up resources) to
solrconfig.xml in example, or remove the extraction jars from lib.

-Yonik
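
The "lazily loaded" option can be pictured as a thin wrapper that defers constructing the real handler until the first request arrives, so an unused extraction handler costs nothing at startup. This is a sketch of the general idea only, not Solr's implementation: SketchHandler and LazyHandler are hypothetical names invented for illustration.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for a request handler interface; not Solr's API.
interface SketchHandler {
    String handle(String request);
}

// Wraps a factory and builds the real handler on first use (double-checked locking).
final class LazyHandler implements SketchHandler {
    private final Supplier<SketchHandler> factory;
    private volatile SketchHandler delegate;   // null until the first request

    LazyHandler(Supplier<SketchHandler> factory) {
        this.factory = factory;
    }

    @Override
    public String handle(String request) {
        SketchHandler h = delegate;
        if (h == null) {
            synchronized (this) {
                h = delegate;
                if (h == null) {
                    h = factory.get();         // the expensive load happens here, once
                    delegate = h;
                }
            }
        }
        return h.handle(request);
    }

    /** True once the underlying handler has been constructed. */
    boolean isLoaded() {
        return delegate != null;
    }
}
```

With a wrapper like this the example config could register the extraction handler without paying for it until someone actually posts a document to it. The jars would still have to be present in lib for that first request to succeed.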


Re: ant example, tika

2008-12-07 Thread Grant Ingersoll
The jars aren't checked in to the example.  The thing is, the  
extraction stuff is a contrib and not on by default and not packaged  
into the WAR.  DIH currently does package into the WAR, but I don't  
think contribs should do that.


Frankly, I think the better answer is an overhaul of the examples  
directory, like we discussed in the clean up thread that is also  
taking place.  Keep the main example nice and simple, and then have  
more organized other examples.



On Dec 7, 2008, at 3:17 PM, Yonik Seeley wrote:

On Sun, Dec 7, 2008 at 2:47 PM, Grant Ingersoll [EMAIL PROTECTED] wrote:
Tika shouldn't be in the example.  It just puts the libs there but is not
hooked into the config.


Hmmm, so it's half installed.  Seems like we should either add in the
extraction handler (perhaps lazily loaded if it takes up resources) to
solrconfig.xml in example, or remove the extraction jars from lib.

-Yonik





Re: ant example, tika

2008-12-07 Thread Yonik Seeley
On Sun, Dec 7, 2008 at 5:50 PM, Grant Ingersoll [EMAIL PROTECTED] wrote:
 The jars aren't checked in to the example.

But ant example puts them there - doesn't matter from a user perspective.

 The thing is, the extraction
 stuff is a contrib and not on by default and not packaged into the WAR.

But then why do all the jars still get copied into the example lib directory?
Is this intentional?

-Yonik

  DIH
 currently does package into the WAR, but I don't think contribs should do
 that.

 Frankly, I think the better answer is an overhaul of the examples directory,
 like we discussed in the clean up thread that is also taking place.  Keep
 the main example nice and simple, and then have more organized other
 examples.


 On Dec 7, 2008, at 3:17 PM, Yonik Seeley wrote:

 On Sun, Dec 7, 2008 at 2:47 PM, Grant Ingersoll [EMAIL PROTECTED]
 wrote:

 Tika shouldn't be in the example.  It just puts the libs there but is not
 hooked into the config.

 Hmmm, so it's half installed.  Seems like we should either add in the
 extraction handler (perhaps lazily loaded if it takes up resources) to
 solrconfig.xml in example, or remove the extraction jars from lib.

 -Yonik





Re: ant example, tika

2008-12-07 Thread Ryan McKinley


On Dec 7, 2008, at 5:50 PM, Grant Ingersoll wrote:

The jars aren't checked in to the example.  The thing is, the  
extraction stuff is a contrib and not on by default and not packaged  
into the WAR.  DIH currently does package into the WAR, but I don't  
think contribs should do that.


Frankly, I think the better answer is an overhaul of the examples  
directory, like we discussed in the clean up thread that is also  
taking place.  Keep the main example nice and simple, and then have  
more organized other examples.




+1

Rethinking the example directory structure / process is a good idea...


Re: ant example, tika

2008-12-07 Thread Grant Ingersoll


On Dec 7, 2008, at 5:56 PM, Yonik Seeley wrote:

On Sun, Dec 7, 2008 at 5:50 PM, Grant Ingersoll [EMAIL PROTECTED] wrote:

The jars aren't checked in to the example.

But ant example puts them there - doesn't matter from a user perspective.

The thing is, the extraction
stuff is a contrib and not on by default and not packaged into the WAR.

But then why do all the jars still get copied into the example lib directory?
Is this intentional?


Yes, it is intentional.  The patch originally had the ERH turned on in  
the example, but then crazily enough, the core unit tests have  
dependencies on the example, so go figure.  So, I thought this was a  
compromise.  Have the example ready to go if someone runs ant  
example but not break the unit tests that for some reason depend on  
example code.


Like I said, I think we really need to take a step back and better  
organize the code and think just a little bit more about packaging,  
core, contribs, clients, etc. b/c right now it's all a big mish-mash.


-Grant


Re: ant example, tika

2008-12-07 Thread Shalin Shekhar Mangar
On Mon, Dec 8, 2008 at 1:02 AM, Ryan McKinley [EMAIL PROTECTED] wrote:


 Also, it looks like DataImportHandler puts itself in the war file -- I
 don't think we want that either.


One big difference between DataImportHandler and other contribs is that it
has zero extra dependencies and it contains a JSP as well. Plus, it is
becoming very popular. I'd like to keep it in the war by default.

-- 
Regards,
Shalin Shekhar Mangar.


Re: ant example, tika

2008-12-07 Thread Erik Hatcher


On Dec 8, 2008, at 2:33 AM, Shalin Shekhar Mangar wrote:
On Mon, Dec 8, 2008 at 1:02 AM, Ryan McKinley [EMAIL PROTECTED] wrote:

Also, it looks like DataImportHandler puts itself in the war file -- I
don't think we want that either.

One big difference between DataImportHandler and other contribs is that it
has zero extra dependencies and it contains a JSP as well. Plus, it is
becoming very popular. I'd like to keep it in the war by default.


Without the JSP it doesn't need to live in the WAR.  But then again,  
if we remove the JSPs by my proposal, we end up with  
VelocityResponseWriter in the WAR :)  [and VrW does have additional  
dependencies, and then DIH would depend on VrW - LOL]


Erik