RE: Hints on constructing/running Solr analyzer chains standalone

2014-07-13 Thread Benson Margulies

Uwe, the last time I looked, Solr was perfectly cheerful about using
analysis components that did not advertise themselves via the factory SPI
system.  So someone might want to go further than calling the available
methods.
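
For illustration, a minimal sketch of what "going further" could look like: instantiating a component directly from its fully qualified class name instead of going through the forName() lookup. It assumes the component has a public constructor taking a Map of arguments, which is (to my knowledge) the convention the Lucene 4.x factories follow. ExplicitFactoryLoader and UnregisteredDemoFactory are hypothetical names made up for this sketch; with Lucene on the classpath one would pass something like TokenFilterFactory.class as the expected type.

```java
import java.util.HashMap;
import java.util.Map;

final class ExplicitFactoryLoader {

    /**
     * Instantiates className through its public (Map) constructor, without any
     * SPI lookup. Failures are rethrown with a message naming the class.
     */
    static <T> T newInstance(String className, Class<T> expectedType, Map<String, String> args) {
        try {
            Class<? extends T> clazz = Class.forName(className).asSubclass(expectedType);
            // Hand over a copy, because a factory may remove the entries it consumes.
            return clazz.getConstructor(Map.class).newInstance(new HashMap<String, String>(args));
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException("Cannot instantiate " + className, e);
        }
    }
}

/** Hypothetical stand-in for a component that is not listed under META-INF/services. */
class UnregisteredDemoFactory {
    final Map<String, String> args;

    public UnregisteredDemoFactory(Map<String, String> args) {
        this.args = args;
    }
}
```

A tool that only enumerates the availableX() names would never see such a class, so a free-text "class name" input may be worth keeping next to any dropdown.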
On Jul 12, 2014 7:24 PM, "Uwe Schindler"  wrote:

> The factories are part of Lucene; Solr is just using them. To list the
> factories available on the classpath, use the
> (Tokenizer|TokenFilter|CharFilter)Factory.availableX() methods (these
> list all their names). You can look them up using the corresponding
> forName() method and build an Analyzer from them. The latter has to be done
> manually; there is no general simple thing like Solr's chains. But that is
> quite easy to implement (if you really need an Analyzer instance). To just
> build a TokenStream for analysis, the factories are all you need (in fact
> Solr's chain just calls the factories in order and returns the result as
> TokenStreamComponents).
> You don't need to deal with SPI; just make the factories available on the
> classpath and Lucene finds them automatically.
>
> For loading resources, use Lucene's ResourceLoader, which gets passed to
> the factory's inform() method. You only *need* to pass one if the factory
> implements ResourceLoaderAware. There are several
> ResourceLoaders available. Solr has its own very complicated one, but the
> default Lucene ones are ClasspathResourceLoader and FilesystemResourceLoader.
>
> Uwe
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
> > -Original Message-
> > From: Alexandre Rafalovitch [mailto:arafa...@gmail.com]
> > Sent: Saturday, July 12, 2014 7:17 PM
> > To: dev@lucene.apache.org
> > Subject: Re: Hints on constructing/running Solr analyzer chains standalone
> >
> > I don't want to read the schema.xml, but I do want to create factories using
> > the same parameters they use in schema. So, it looks like I need to play
> > around with ResourceLoaders and maybe SPI loaders, so things like wordlists
> > get loaded.
> >
> > Starting from FieldAnalyzer turned out to be a dead-end because it was using
> > pre-initialized field definitions. But starting again from Test cases seem to be
> > somewhat more productive.
> >
> > The idea for the project is to give a web UI where a user can quickly put one
> > or more analyzer stacks together and see how it/they perform against text
> > (multiple texts). A bit similar to FieldAnalyzer but allow to have multiple
> > stacks side-by-side and NOT needing to reload the core to add new ones.
> > Then, generate the XML definition, ready for pasting in. That's the target
> > anyway.
> >
> > Regards,
> >Alex.
> > Personal: http://www.outerthoughts.com/ and @arafalov Solr resources:
> > http://www.solr-start.com/ and @solrstart Solr popularizers community:
> > https://www.linkedin.com/groups?gid=6713853
> >
> >
> > On Sat, Jul 12, 2014 at 11:34 PM, Uwe Schindler  wrote:
> > > Hi,
> > >
> > >
> > >> Hmmm, I think it's reasonably straightforward to construct what is
> > >> implied by a Solr analysis chain in Lucene, would that do? Or do you
> > >> want to read a schema.xml file outside Solr?
> > >>
> > >> If the former, then you can pretty much skip the Solr code entirely.
> > >
> > > Read this:
> > >
> > > http://lucene.apache.org/core/4_9_0/core/org/apache/lucene/analysis/package-summary.html#package_description
> > >
> > > To do analysis, Solr is not needed at all, unless you want to read
> > > schema.xml files. If you want to do this, that is quite easy using the
> > > IndexSchema class. You can then get the analyzer from the field type or field
> > > name. How to use the analyzer is described above and unrelated to Solr.
> > >
> > > Uwe
> > >
> > >> On Sat, Jul 12, 2014 at 6:59 AM, Alexandre Rafalovitch
> > >> 
> > >> wrote:
> > >> > Hello,
> > >> >
> > >> > I am interested in creating and running Solr analyzer chains
> > >> > outside of normal process (no live Solr). Just construct a chain,
> > >> > feed it tokens and see what happens.
> > >> >
> > >> > I would appreciate any hints on what that takes and whether there
> > >> > are any hidden/weird dependencies (e.g. for resource

Re: Hints on constructing/running Solr analyzer chains standalone

2014-07-13 Thread Jack Krupansky

I've been through all that code in Solr, and it sounds like you'd have to 
replicate its function. Wow, that's a truly ambitious task! Good Luck!

I'm sure that a fair amount of it could be refactored dramatically to be a 
lot simpler since Solr evolved piecemeal over the years, but... that's 
another monumental task.

And it would indeed be great to have a field type editor and field type API 
for the Solr Admin UI/API itself.

As Uwe indicated, the factories are already in Lucene, so all you need to do 
is generate their parameters from the field type filter parameters. But... 
for a friendly development tool you would probably like a lot more friendly 
parameter checking and error reporting than the raw exceptions (and weak 
validation) found in the traditional Solr/Lucene factories. Again, a lot of 
that could be refactored since it has evolved over the years, but... that's 
another monumental task. Still, Solr would be so much the better for it.

And self-describing (and self-documenting) filter factories would be a 
fantastic improvement to Solr.

-- Jack Krupansky

-Original Message- 
From: Alexandre Rafalovitch

Sent: Saturday, July 12, 2014 1:16 PM
To: dev@lucene.apache.org
Subject: Re: Hints on constructing/running Solr analyzer chains standalone

I don't want to read the schema.xml, but I do want to create factories
using the same parameters they use in schema. So, it looks like I need
to play around with ResourceLoaders and maybe SPI loaders, so things
like wordlists get loaded.

Starting from FieldAnalyzer turned out to be a dead-end because it was
using pre-initialized field definitions. But starting again from Test
cases seem to be somewhat more productive.

The idea for the project is to give a web UI where a user can quickly
put one or more analyzer stacks together and see how it/they perform
against text (multiple texts). A bit similar to FieldAnalyzer but
allow to have multiple stacks side-by-side and NOT needing to reload
the core to add new ones. Then, generate the XML definition, ready for
pasting in. That's the target anyway.

Regards,
  Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853

On Sat, Jul 12, 2014 at 11:34 PM, Uwe Schindler  wrote:

Hi,

Hmmm, I think it's reasonably straightforward to construct what is implied
by a Solr analysis chain in Lucene, would that do? Or do you want to read a
schema.xml file outside Solr?

If the former, then you can pretty much skip the Solr code entirely.

Read this: 
http://lucene.apache.org/core/4_9_0/core/org/apache/lucene/analysis/package-summary.html#package_description

To do analysis, Solr is not needed at all, unless you want to read 
schema.xml files. If you want to do this, that is quite easy using the 
IndexSchema class. You can then get the analyzer from the field type or 
field name. How to use the analyzer is described above and unrelated to 
Solr.

Uwe

On Sat, Jul 12, 2014 at 6:59 AM, Alexandre Rafalovitch 

wrote:
> Hello,
>
> I am interested in creating and running Solr analyzer chains outside
> of normal process (no live Solr). Just construct a chain, feed it
> tokens and see what happens.
>
> I would appreciate any hints on what that takes and whether there are
> any hidden/weird dependencies (e.g. for resource discoveries). I tried
> tracing through FieldAnalysis calls, but can't actually seem to find
> the point where the actual analysis is done. Just getting lost in sets
> of NamedList
> Regards,
>Alex.
> Personal: http://www.outerthoughts.com/ and @arafalov Solr resources:
> http://www.solr-start.com/ and @solrstart Solr popularizers community:
> https://www.linkedin.com/groups?gid=6713853
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For
> additional commands, e-mail: dev-h...@lucene.apache.org
>

Re: Hints on constructing/running Solr analyzer chains standalone

2014-07-12 Thread Alexandre Rafalovitch

Right,

For the first cut, I am not planning to let people edit things like
synonyms files. Just select from a pre-existing list/dropdown. And
I'll most probably start by providing a bunch of fixed stacks taken
from example schemas. So, none of Solr's flexibility is required;
I just need to wire it all up correctly.

One of the limiting factors is that the factories are NOT
self-describing. So, I can't figure out what parameters are allowed,
what form they take, and what their descriptions are. So, I will probably
have to hard-code that somewhere.
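
To make the "hard-code that somewhere" idea concrete, here is a rough sketch of a hand-maintained parameter catalog that also produces friendlier messages than a raw factory exception. Everything in it is an assumption for illustration: FactoryParamCatalog, Param, and sample() are hypothetical names, and the "synonym" entries are written by hand from memory of the schema examples, not read from Lucene.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FactoryParamCatalog {

    /** One hand-written parameter description. */
    static final class Param {
        final String name;
        final boolean required;
        final String description;

        Param(String name, boolean required, String description) {
            this.name = name;
            this.required = required;
            this.description = description;
        }
    }

    private final Map<String, List<Param>> byFactory = new LinkedHashMap<String, List<Param>>();

    /** Records the parameters of one factory; returns this for chaining. */
    FactoryParamCatalog describe(String factoryName, Param... params) {
        byFactory.put(factoryName, Arrays.asList(params));
        return this;
    }

    /** Returns readable problems; an empty list means the args match the catalog. */
    List<String> validate(String factoryName, Map<String, String> args) {
        List<String> problems = new ArrayList<String>();
        List<Param> known = byFactory.get(factoryName);
        if (known == null) {
            problems.add("No parameter descriptions recorded for '" + factoryName + "'");
            return problems;
        }
        List<String> knownNames = new ArrayList<String>();
        for (Param p : known) {
            knownNames.add(p.name);
            if (p.required && !args.containsKey(p.name)) {
                problems.add("Missing required parameter '" + p.name + "': " + p.description);
            }
        }
        for (String key : args.keySet()) {
            if (!knownNames.contains(key)) {
                problems.add("Unknown parameter '" + key + "'; known parameters: " + knownNames);
            }
        }
        return problems;
    }

    /** Illustrative entries only; a real catalog would need checking against the factory sources. */
    static FactoryParamCatalog sample() {
        return new FactoryParamCatalog().describe("synonym",
                new Param("synonyms", true, "name of the synonyms file"),
                new Param("ignoreCase", false, "true to match case-insensitively"),
                new Param("expand", false, "true to expand to all equivalent synonyms"));
    }
}
```

The same catalog could drive the UI form (labels, required markers) and the pre-flight check before any factory is instantiated.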

And if that turns out to be too hard... Well, let's just say I have a
very long list of cool projects and am looking for the most impact for
my time investment. But from digging into the sources yesterday, the
backend looks quite doable. The front-end is - of course - always more
of a challenge.

Regards,
   Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853


On Sun, Jul 13, 2014 at 10:55 AM, Erick Erickson
 wrote:
> Hmmm, sounds pretty cool!
>
> I wonder if it would be sufficient, for the first cut anyway, to let the user
> specify whatever was necessary to bypass all the ResourceLoader stuff,
> why make the user put the files in a place Solr knows about? Instead, for
> development, it might be sufficient (and less error prone) to require them
> to give the UI the information.
>
> Of course _you're_ the one doing the work, so whatever you think best.
>
> Erick
>
> On Sat, Jul 12, 2014 at 10:16 AM, Alexandre Rafalovitch
>  wrote:
>> I don't want to read the schema.xml, but I do want to create factories
>> using the same parameters they use in schema. So, it looks like I need
>> to play around with ResourceLoaders and maybe SPI loaders, so things
>> like wordlists get loaded.
>>
>> Starting from FieldAnalyzer turned out to be a dead-end because it was
>> using pre-initialized field definitions. But starting again from Test
>> cases seem to be somewhat more productive.
>>
>> The idea for the project is to give a web UI where a user can quickly
>> put one or more analyzer stacks together and see how it/they perform
>> against text (multiple texts). A bit similar to FieldAnalyzer but
>> allow to have multiple stacks side-by-side and NOT needing to reload
>> the core to add new ones. Then, generate the XML definition, ready for
>> pasting in. That's the target anyway.
>>
>> Regards,
>>Alex.
>> Personal: http://www.outerthoughts.com/ and @arafalov
>> Solr resources: http://www.solr-start.com/ and @solrstart
>> Solr popularizers community: https://www.linkedin.com/groups?gid=6713853
>>
>>
>> On Sat, Jul 12, 2014 at 11:34 PM, Uwe Schindler  wrote:
>>> Hi,
>>>
>>>
>>>> Hmmm, I think it's reasonably straightforward to construct what is implied
>>>> by a Solr analysis chain in Lucene, would that do? Or do you want to read a
>>>> schema.xml file outside Solr?
>>>>
>>>> If the former, then you can pretty much skip the Solr code entirely.
>>>
>>> Read this: 
>>> http://lucene.apache.org/core/4_9_0/core/org/apache/lucene/analysis/package-summary.html#package_description
>>>
>>> To do analysis, Solr is not needed at all, unless you want to read 
>>> schema.xml files. If you want to do this, that is quite easy using the 
>>> IndexSchema class. You can then get the analyzer from the field type or 
>>> field name. How to use the analyzer is described above and unrelated to 
>>> Solr.
>>>
>>> Uwe
>>>
>>>> On Sat, Jul 12, 2014 at 6:59 AM, Alexandre Rafalovitch
>>>> wrote:
>>>> > Hello,
>>>> >
>>>> > I am interested in creating and running Solr analyzer chains outside
>>>> > of normal process (no live Solr). Just construct a chain, feed it
>>>> > tokens and see what happens.
>>>> >
>>>> > I would appreciate any hints on what that takes and whether there are
>>>> > any hidden/weird dependencies (e.g. for resource discoveries). I tried
>>>> > tracing through FieldAnalysis calls, but can't actually seem to find
>>>> > the point where the actual analysis is done. Just getting lost in sets
>>>> > of NamedList
>>>> >
>>>> > Regards,
>>>> >Alex.
>>>> > Personal: http://www.outerthoughts.com/ and @arafalov Solr resources:
>>>> > http://www.solr-start.com/ and @solrstart Solr popularizers community:
>>>> > https://www.linkedin.com/groups?gid=6713853
>>>> >

Re: Hints on constructing/running Solr analyzer chains standalone

2014-07-12 Thread Erick Erickson

Hmmm, sounds pretty cool!

I wonder if it would be sufficient, for the first cut anyway, to let the user
specify whatever was necessary to bypass all the ResourceLoader stuff,
why make the user put the files in a place Solr knows about? Instead, for
development, it might be sufficient (and less error prone) to require them
to give the UI the information.

Of course _you're_ the one doing the work, so whatever you think best.

Erick

On Sat, Jul 12, 2014 at 10:16 AM, Alexandre Rafalovitch
 wrote:
> I don't want to read the schema.xml, but I do want to create factories
> using the same parameters they use in schema. So, it looks like I need
> to play around with ResourceLoaders and maybe SPI loaders, so things
> like wordlists get loaded.
>
> Starting from FieldAnalyzer turned out to be a dead-end because it was
> using pre-initialized field definitions. But starting again from Test
> cases seem to be somewhat more productive.
>
> The idea for the project is to give a web UI where a user can quickly
> put one or more analyzer stacks together and see how it/they perform
> against text (multiple texts). A bit similar to FieldAnalyzer but
> allow to have multiple stacks side-by-side and NOT needing to reload
> the core to add new ones. Then, generate the XML definition, ready for
> pasting in. That's the target anyway.
>
> Regards,
>Alex.
> Personal: http://www.outerthoughts.com/ and @arafalov
> Solr resources: http://www.solr-start.com/ and @solrstart
> Solr popularizers community: https://www.linkedin.com/groups?gid=6713853
>
>
> On Sat, Jul 12, 2014 at 11:34 PM, Uwe Schindler  wrote:
>> Hi,
>>
>>
>>> Hmmm, I think it's reasonably straightforward to construct what is implied
>>> by a Solr analysis chain in Lucene, would that do? Or do you want to read a
>>> schema.xml file outside Solr?
>>>
>>> If the former, then you can pretty much skip the Solr code entirely.
>>
>> Read this: 
>> http://lucene.apache.org/core/4_9_0/core/org/apache/lucene/analysis/package-summary.html#package_description
>>
>> To do analysis, Solr is not needed at all, unless you want to read 
>> schema.xml files. If you want to do this, that is quite easy using the 
>> IndexSchema class. You can then get the analyzer from the field type or 
>> field name. How to use the analyzer is described above and unrelated to Solr.
>>
>> Uwe
>>
>>> On Sat, Jul 12, 2014 at 6:59 AM, Alexandre Rafalovitch 
>>> wrote:
>>> > Hello,
>>> >
>>> > I am interested in creating and running Solr analyzer chains outside
>>> > of normal process (no live Solr). Just construct a chain, feed it
>>> > tokens and see what happens.
>>> >
>>> > I would appreciate any hints on what that takes and whether there are
>>> > any hidden/weird dependencies (e.g. for resource discoveries). I tried
>>> > tracing through FieldAnalysis calls, but can't actually seem to find
>>> > the point where the actual analysis is done. Just getting lost in sets
>>> > of NamedList
>>> >
>>> > Regards,
>>> >Alex.
>>> > Personal: http://www.outerthoughts.com/ and @arafalov Solr resources:
>>> > http://www.solr-start.com/ and @solrstart Solr popularizers community:
>>> > https://www.linkedin.com/groups?gid=6713853
>>> >

Re: Hints on constructing/running Solr analyzer chains standalone

2014-07-12 Thread david.w.smi...@gmail.com

That sounds like a wonderful project, Alexandre. I've always wanted such a
capability!

I suggest approaching this very pragmatically, minimizing the time
to get something useful, which means leveraging as much as is available
already: Solr's existing analysis UI screen. I suggest
modifying the FieldAnalysisRequestHandler so that it can take, as optional
input, an XML fieldType definition provided in the request instead of using
the live schema. It would create a new temporary SolrSchema based on the
provided data, then re-use the rest of its field analyzing code based on that
schema. Disclaimer: I have yet to look at FieldAnalysisRequestHandler.

~ David Smiley
Freelance Apache Lucene/Solr Search Consultant/Developer
http://www.linkedin.com/in/davidwsmiley

On Sat, Jul 12, 2014 at 1:16 PM, Alexandre Rafalovitch 
wrote:

> I don't want to read the schema.xml, but I do want to create factories
> using the same parameters they use in schema. So, it looks like I need
> to play around with ResourceLoaders and maybe SPI loaders, so things
> like wordlists get loaded.
>
> Starting from FieldAnalyzer turned out to be a dead-end because it was
> using pre-initialized field definitions. But starting again from Test
> cases seem to be somewhat more productive.
>
> The idea for the project is to give a web UI where a user can quickly
> put one or more analyzer stacks together and see how it/they perform
> against text (multiple texts). A bit similar to FieldAnalyzer but
> allow to have multiple stacks side-by-side and NOT needing to reload
> the core to add new ones. Then, generate the XML definition, ready for
> pasting in. That's the target anyway.
>
> Regards,
>Alex.
> Personal: http://www.outerthoughts.com/ and @arafalov
> Solr resources: http://www.solr-start.com/ and @solrstart
> Solr popularizers community: https://www.linkedin.com/groups?gid=6713853
>
>

RE: Hints on constructing/running Solr analyzer chains standalone

2014-07-12 Thread Uwe Schindler

The factories are part of Lucene; Solr is just using them. To list the
factories available on the classpath, use the
(Tokenizer|TokenFilter|CharFilter)Factory.availableX() methods (these list all
their names). You can look them up using the corresponding forName() method and
build an Analyzer from them. The latter has to be done manually; there is no
general simple thing like Solr's chains. But that is quite easy to implement
(if you really need an Analyzer instance). To just build a TokenStream for
analysis, the factories are all you need (in fact Solr's chain just calls the
factories in order and returns the result as TokenStreamComponents).
You don't need to deal with SPI; just make the factories available on the
classpath and Lucene finds them automatically.

For loading resources, use Lucene's ResourceLoader, which gets passed to the
factory's inform() method. You only *need* to pass one if the factory
implements ResourceLoaderAware. There are several ResourceLoaders available.
Solr has its own very complicated one, but the default Lucene ones are
ClasspathResourceLoader and FilesystemResourceLoader.
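
As a rough illustration of the steps described above, here is a minimal sketch written against the Lucene 4.9 API that this thread links to (the createComponents signature and the package names differ in later major versions). FactoryChainDemo, args(), build(), and tokens() are hypothetical names for this sketch. Passing "luceneMatchVersion" in the argument map is an assumption about what the 4.x factories expect, and it needs lucene-core and lucene-analyzers-common on the classpath.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.util.ClasspathResourceLoader;
import org.apache.lucene.analysis.util.ResourceLoader;
import org.apache.lucene.analysis.util.ResourceLoaderAware;
import org.apache.lucene.analysis.util.TokenFilterFactory;
import org.apache.lucene.analysis.util.TokenizerFactory;

final class FactoryChainDemo {

    /** Builds a fresh, mutable argument map; factories remove the entries they consume. */
    static Map<String, String> args(String... keyValues) {
        Map<String, String> m = new HashMap<String, String>();
        m.put("luceneMatchVersion", "4.9"); // assumption: expected by the 4.x factories
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            m.put(keyValues[i], keyValues[i + 1]);
        }
        return m;
    }

    /** Looks up two factories by SPI name and wires them into an Analyzer by hand. */
    static Analyzer build() throws IOException {
        ResourceLoader loader = new ClasspathResourceLoader(FactoryChainDemo.class);
        final TokenizerFactory tokenizer = TokenizerFactory.forName("whitespace", args());
        final TokenFilterFactory lowercase = TokenFilterFactory.forName("lowercase", args());
        for (Object factory : new Object[] {tokenizer, lowercase}) {
            // Only factories that load files (stopwords, synonyms, ...) need a loader.
            if (factory instanceof ResourceLoaderAware) {
                ((ResourceLoaderAware) factory).inform(loader);
            }
        }
        return new Analyzer() {
            @Override
            protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
                Tokenizer source = tokenizer.create(reader);
                return new TokenStreamComponents(source, lowercase.create(source));
            }
        };
    }

    /** Runs text through the analyzer and collects the resulting terms. */
    static List<String> tokens(Analyzer analyzer, String text) throws IOException {
        List<String> out = new ArrayList<String>();
        try (TokenStream ts = analyzer.tokenStream("field", text)) {
            CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
            ts.reset();
            while (ts.incrementToken()) {
                out.add(term.toString());
            }
            ts.end();
        }
        return out;
    }
}
```

Calling FactoryChainDemo.tokens(FactoryChainDemo.build(), "Hello World") should yield the terms hello and world. Later Lucene releases (5.0 onward) also ship a CustomAnalyzer builder that wraps the same factory lookups, which may make the hand-written Analyzer unnecessary there.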

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


> -Original Message-
> From: Alexandre Rafalovitch [mailto:arafa...@gmail.com]
> Sent: Saturday, July 12, 2014 7:17 PM
> To: dev@lucene.apache.org
> Subject: Re: Hints on constructing/running Solr analyzer chains standalone
> 
> I don't want to read the schema.xml, but I do want to create factories using
> the same parameters they use in schema. So, it looks like I need to play
> around with ResourceLoaders and maybe SPI loaders, so things like wordlists
> get loaded.
> 
> Starting from FieldAnalyzer turned out to be a dead-end because it was using
> pre-initialized field definitions. But starting again from Test cases seem to 
> be
> somewhat more productive.
> 
> The idea for the project is to give a web UI where a user can quickly put one
> or more analyzer stacks together and see how it/they perform against text
> (multiple texts). A bit similar to FieldAnalyzer but allow to have multiple
> stacks side-by-side and NOT needing to reload the core to add new ones.
> Then, generate the XML definition, ready for pasting in. That's the target
> anyway.
> 
> Regards,
>Alex.
> Personal: http://www.outerthoughts.com/ and @arafalov Solr resources:
> http://www.solr-start.com/ and @solrstart Solr popularizers community:
> https://www.linkedin.com/groups?gid=6713853
> 
> 
> On Sat, Jul 12, 2014 at 11:34 PM, Uwe Schindler  wrote:
> > Hi,
> >
> >
> >> Hmmm, I think it's reasonably straightforward to construct what is
> >> implied by a Solr analysis chain in Lucene, would that do? Or do you
> >> want to read a schema.xml file outside Solr?
> >>
> >> If the former, then you can pretty much skip the Solr code entirely.
> >
> > Read this:
> >
> > http://lucene.apache.org/core/4_9_0/core/org/apache/lucene/analysis/package-summary.html#package_description
> >
> > To do analysis, Solr is not needed at all, unless you want to read
> > schema.xml files. If you want to do this, that is quite easy using the
> > IndexSchema class. You can then get the analyzer from the field type or field
> > name. How to use the analyzer is described above and unrelated to Solr.
> >
> > Uwe
> >
> >> On Sat, Jul 12, 2014 at 6:59 AM, Alexandre Rafalovitch
> >> 
> >> wrote:
> >> > Hello,
> >> >
> >> > I am interested in creating and running Solr analyzer chains
> >> > outside of normal process (no live Solr). Just construct a chain,
> >> > feed it tokens and see what happens.
> >> >
> >> > I would appreciate any hints on what that takes and whether there
> >> > are any hidden/weird dependencies (e.g. for resource discoveries).
> >> > I tried tracing through FieldAnalysis calls, but can't actually
> >> > seem to find the point where the actual analysis is done. Just
> >> > getting lost in sets of NamedList
> >> >
> >> > Regards,
> >> >Alex.
> >> > Personal: http://www.outerthoughts.com/ and @arafalov Solr resources:
> >> > http://www.solr-start.com/ and @solrstart Solr popularizers community:
> >> > https://www.linkedin.com/groups?gid=6713853
> >> >

Re: Hints on constructing/running Solr analyzer chains standalone

2014-07-12 Thread Alexandre Rafalovitch

I don't want to read the schema.xml, but I do want to create factories
using the same parameters they use in schema. So, it looks like I need
to play around with ResourceLoaders and maybe SPI loaders, so things
like wordlists get loaded.

Starting from FieldAnalyzer turned out to be a dead end because it was
using pre-initialized field definitions. But starting again from the test
cases seems to be somewhat more productive.

The idea for the project is to provide a web UI where a user can quickly
put one or more analyzer stacks together and see how they perform
against text (multiple texts). It is a bit similar to FieldAnalyzer, but
allows multiple stacks side by side and does NOT need a core reload
to add new ones. Then, generate the XML definition, ready for
pasting in. That's the target anyway.
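
The last step, generating the XML definition, could look roughly like the sketch below. It renders an ordered list of components in the usual schema.xml fieldType shape. FieldTypeXml, Component, and render() are hypothetical names for this sketch, and class="solr.TextField" is assumed as the field type class.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FieldTypeXml {

    /** One element of the chain: a charFilter, tokenizer, or filter with its parameters. */
    static final class Component {
        final String element;      // "charFilter", "tokenizer" or "filter"
        final String factoryClass; // e.g. "solr.LowerCaseFilterFactory"
        final Map<String, String> params = new LinkedHashMap<String, String>();

        Component(String element, String factoryClass) {
            this.element = element;
            this.factoryClass = factoryClass;
        }

        /** Adds one attribute; insertion order is kept in the output. */
        Component with(String key, String value) {
            params.put(key, value);
            return this;
        }
    }

    /** Escapes the characters that are unsafe inside a double-quoted XML attribute. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Renders the chain, in list order, as a fieldType definition. */
    static String render(String typeName, List<Component> chain) {
        StringBuilder sb = new StringBuilder();
        sb.append("<fieldType name=\"").append(escape(typeName))
          .append("\" class=\"solr.TextField\">\n");
        sb.append("  <analyzer>\n");
        for (Component c : chain) {
            sb.append("    <").append(c.element)
              .append(" class=\"").append(escape(c.factoryClass)).append('"');
            for (Map.Entry<String, String> e : c.params.entrySet()) {
                sb.append(' ').append(e.getKey())
                  .append("=\"").append(escape(e.getValue())).append('"');
            }
            sb.append("/>\n");
        }
        sb.append("  </analyzer>\n</fieldType>\n");
        return sb.toString();
    }
}
```

Keeping the chain as plain data like this means the same list can feed both the factory instantiation and the XML output, so the pasted definition matches what was actually tested.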

Regards,
   Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853

On Sat, Jul 12, 2014 at 11:34 PM, Uwe Schindler  wrote:
> Hi,
>
>
>> Hmmm, I think it's reasonably straightforward to construct what is implied
>> by a Solr analysis chain in Lucene, would that do? Or do you want to read a
>> schema.xml file outside Solr?
>>
>> If the former, then you can pretty much skip the Solr code entirely.
>
> Read this: 
> http://lucene.apache.org/core/4_9_0/core/org/apache/lucene/analysis/package-summary.html#package_description
>
> To do analysis, Solr is not needed at all, unless you want to read schema.xml 
> files. If you want to do this, that is quite easy using the IndexSchema 
> class. You can then get the analyzer from the field type or field name. How 
> to use the analyzer is described above and unrelated to Solr.
>
> Uwe
>
>> On Sat, Jul 12, 2014 at 6:59 AM, Alexandre Rafalovitch 
>> wrote:
>> > Hello,
>> >
>> > I am interested in creating and running Solr analyzer chains outside
>> > of normal process (no live Solr). Just construct a chain, feed it
>> > tokens and see what happens.
>> >
>> > I would appreciate any hints on what that takes and whether there are
>> > any hidden/weird dependencies (e.g. for resource discoveries). I tried
>> > tracing through FieldAnalysis calls, but can't actually seem to find
>> > the point where the actual analysis is done. Just getting lost in sets
>> > of NamedList
>> >
>> > Regards,
>> >Alex.
>> > Personal: http://www.outerthoughts.com/ and @arafalov Solr resources:
>> > http://www.solr-start.com/ and @solrstart Solr popularizers community:
>> > https://www.linkedin.com/groups?gid=6713853
>> >

RE: Hints on constructing/running Solr analyzer chains standalone

2014-07-12 Thread Uwe Schindler

Hi,


> Hmmm, I think it's reasonably straightforward to construct what is implied
> by a Solr analysis chain in Lucene, would that do? Or do you want to read a
> schema.xml file outside Solr?
> 
> If the former, then you can pretty much skip the Solr code entirely.

Read this: 
http://lucene.apache.org/core/4_9_0/core/org/apache/lucene/analysis/package-summary.html#package_description

To do analysis, Solr is not needed at all, unless you want to read schema.xml 
files. If you want to do this, that is quite easy using the IndexSchema class. 
You can then get the analyzer from the field type or field name. How to use the 
analyzer is described above and unrelated to Solr.

Uwe

> On Sat, Jul 12, 2014 at 6:59 AM, Alexandre Rafalovitch 
> wrote:
> > Hello,
> >
> > I am interested in creating and running Solr analyzer chains outside
> > of normal process (no live Solr). Just construct a chain, feed it
> > tokens and see what happens.
> >
> > I would appreciate any hints on what that takes and whether there are
> > any hidden/weird dependencies (e.g. for resource discoveries). I tried
> > tracing through FieldAnalysis calls, but can't actually seem to find
> > the point where the actual analysis is done. Just getting lost in sets
> > of NamedList
> >
> > Regards,
> >Alex.
> > Personal: http://www.outerthoughts.com/ and @arafalov Solr resources:
> > http://www.solr-start.com/ and @solrstart Solr popularizers community:
> > https://www.linkedin.com/groups?gid=6713853
> >

Re: Hints on constructing/running Solr analyzer chains standalone

2014-07-12 Thread Erick Erickson

Hmmm, I think it's reasonably straightforward to construct what
is implied by a Solr analysis chain in Lucene, would that do? Or
do you want to read a schema.xml file outside Solr?

If the former, then you can pretty much skip the Solr code entirely.

FWIW,
Erick

On Sat, Jul 12, 2014 at 6:59 AM, Alexandre Rafalovitch
 wrote:
> Hello,
>
> I am interested in creating and running Solr analyzer chains outside
> of normal process (no live Solr). Just construct a chain, feed it
> tokens and see what happens.
>
> I would appreciate any hints on what that takes and whether there are
> any hidden/weird dependencies (e.g. for resource discoveries). I tried
> tracing through FieldAnalysis calls, but can't actually seem to find
> the point where the actual analysis is done. Just getting lost in sets
> of NamedList
> Regards,
>Alex.
> Personal: http://www.outerthoughts.com/ and @arafalov
> Solr resources: http://www.solr-start.com/ and @solrstart
> Solr popularizers community: https://www.linkedin.com/groups?gid=6713853
>

Re: Hints on constructing/running Solr analyzer chains standalone

2014-07-12 Thread Alexandre Rafalovitch

Uhm. That's where I did start. :-(

Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853

On Sat, Jul 12, 2014 at 9:50 PM, Jack Krupansky  wrote:
> Tracing through indexing or query parsing is... a challenge. Start with
> something simpler like the analysis admin API.
>
> See:
> http://lucene.apache.org/solr/4_9_0/solr-core/org/apache/solr/handler/FieldAnalysisRequestHandler.html
>
> -- Jack Krupansky
>
> -Original Message- From: Alexandre Rafalovitch
> Sent: Saturday, July 12, 2014 9:59 AM
> To: dev@lucene.apache.org
> Subject: Hints on constructing/running Solr analyzer chains standalone
>
>
> Hello,
>
> I am interested in creating and running Solr analyzer chains outside
> of normal process (no live Solr). Just construct a chain, feed it
> tokens and see what happens.
>
> I would appreciate any hints on what that takes and whether there are
> any hidden/weird dependencies (e.g. for resource discoveries). I tried
> tracing through FieldAnalysis calls, but can't actually seem to find
> the point where the actual analysis is done. Just getting lost in sets
> of NamedList

Re: Hints on constructing/running Solr analyzer chains standalone

2014-07-12 Thread Jack Krupansky

Tracing through indexing or query parsing is... a challenge. Start with 
something simpler like the analysis admin API.


See:
http://lucene.apache.org/solr/4_9_0/solr-core/org/apache/solr/handler/FieldAnalysisRequestHandler.html

-- Jack Krupansky

-Original Message- 
From: Alexandre Rafalovitch

Sent: Saturday, July 12, 2014 9:59 AM
To: dev@lucene.apache.org
Subject: Hints on constructing/running Solr analyzer chains standalone

Hello,

I am interested in creating and running Solr analyzer chains outside
of normal process (no live Solr). Just construct a chain, feed it
tokens and see what happens.

I would appreciate any hints on what that takes and whether there are
any hidden/weird dependencies (e.g. for resource discoveries). I tried
tracing through FieldAnalysis calls, but can't actually seem to find
the point where the actual analysis is done. Just getting lost in sets
of NamedList

Regards,
   Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853
