RE: Hints on constructing/running Solr analyzer chains standalone
Uwe, the last time I looked, Solr was perfectly cheerful about using analysis components that did not advertise themselves via the factory SPI system. So someone might want to go further than calling the available methods.

On Jul 12, 2014 7:24 PM, "Uwe Schindler" wrote:
> You don't need to deal with SPI, just make the factories available in
> classpath, Lucene finds them automatically.
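The point about components that are not registered through SPI can be illustrated with plain reflection. This is only a sketch: `DirectFactoryLoad` and `DemoFilterFactory` are hypothetical stand-ins so the example runs without Lucene on the classpath, and it assumes the target class exposes a public `Map<String,String>` constructor, which is the convention the Lucene 4.x factories appear to follow.

```java
import java.lang.reflect.Constructor;
import java.util.HashMap;
import java.util.Map;

public class DirectFactoryLoad {

    /** Hypothetical stand-in for a factory that is NOT listed in META-INF/services. */
    public static class DemoFilterFactory {
        private final Map<String, String> args;

        public DemoFilterFactory(Map<String, String> args) {
            this.args = new HashMap<>(args);
        }

        public String describe() {
            return "DemoFilterFactory" + args;
        }
    }

    /** Instantiates a factory by fully qualified class name, bypassing any SPI lookup. */
    public static Object newFactory(String className, Map<String, String> args) {
        try {
            Class<?> clazz = Class.forName(className);
            Constructor<?> ctor = clazz.getConstructor(Map.class);
            return ctor.newInstance(args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot instantiate " + className, e);
        }
    }

    public static void main(String[] unused) {
        Map<String, String> args = new HashMap<>();
        args.put("minLength", "3");
        Object factory = newFactory(DemoFilterFactory.class.getName(), args);
        System.out.println(((DemoFilterFactory) factory).describe());
    }
}
```

With Lucene on the classpath, the same call would presumably be pointed at a real factory class name instead of the stand-in, so a tool could accept either an SPI name or a class name.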
Re: Hints on constructing/running Solr analyzer chains standalone
I've been through all that code in Solr, and it sounds like you'd have to replicate its function. Wow, that's a truly ambitious task! Good luck!

I'm sure that a fair amount of it could be refactored dramatically to be a lot simpler since Solr evolved piecemeal over the years, but... that's another monumental task. And it would indeed be great to have a field type editor and field type API for the Solr Admin UI/API itself.

As Uwe indicated, the factories are already in Lucene, so all you need to do is generate their parameters from the field type filter parameters. But... for a friendly development tool you would probably like a lot more friendly parameter checking and error reporting than the raw exceptions (and weak validation) found in the traditional Solr/Lucene factories. Again, a lot of that could be refactored since it has evolved over the years, but... that's another monumental task. Still, Solr would be so much the better for it.

And self-describing (and self-documenting) filter factories would be a fantastic improvement to Solr.

-- Jack Krupansky

-----Original Message-----
From: Alexandre Rafalovitch
Sent: Saturday, July 12, 2014 1:16 PM
To: dev@lucene.apache.org
Subject: Re: Hints on constructing/running Solr analyzer chains standalone

> I don't want to read the schema.xml, but I do want to create factories
> using the same parameters they use in schema.

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
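The friendlier parameter checking Jack asks for could be layered in front of the factories rather than inside them. The sketch below is an assumption about how a tool might do that, not existing Solr or Lucene code: `FriendlyParamCheck` and the allowed/required sets are hand-written for illustration, and the check collects every problem instead of stopping at the first exception.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class FriendlyParamCheck {

    /** Returns one readable message per problem; an empty list means the args look fine. */
    public static List<String> check(String factory, Set<String> allowed,
                                     Set<String> required, Map<String, String> args) {
        List<String> problems = new ArrayList<>();
        for (String name : required) {
            if (!args.containsKey(name)) {
                problems.add(factory + ": missing required parameter '" + name + "'");
            }
        }
        for (String name : args.keySet()) {
            if (!allowed.contains(name)) {
                problems.add(factory + ": unknown parameter '" + name
                        + "' (allowed: " + allowed + ")");
            }
        }
        return problems;
    }

    public static void main(String[] unused) {
        // Hand-written parameter lists for illustration only, not read from the factory.
        Set<String> allowed = new LinkedHashSet<>(Arrays.asList("synonyms", "ignoreCase", "expand"));
        Set<String> required = new LinkedHashSet<>(Arrays.asList("synonyms"));

        Map<String, String> args = new LinkedHashMap<>();
        args.put("ignorCase", "true"); // deliberate typo in the parameter name

        for (String problem : check("synonym", allowed, required, args)) {
            System.out.println(problem);
        }
    }
}
```

Running the checks before constructing a factory lets a UI show all mistakes at once, next to the offending field, instead of surfacing a single raw exception.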
Re: Hints on constructing/running Solr analyzer chains standalone
Right. For the first cut, I am not planning to let people edit things like synonym files. Just select from a pre-existing list/dropdown. And I'll most probably start by providing a bunch of fixed stacks taken from example schemas. So, none of Solr's flexibility is required, I just need to wire it all up correctly.

One of the limiting factors is that the factories are NOT self-describing. So, I can't figure out what parameter is allowed, what form it takes and what its description is. So, I will probably have to hard-code that somewhere. And if that turns out to be too hard... Well, let's just say I have a very long list of cool projects and am looking for the most impact for my time investment.

But from digging into the sources yesterday, the backend looks quite doable. The front-end is - of course - always more of a challenge.

Regards,
   Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853

On Sun, Jul 13, 2014 at 10:55 AM, Erick Erickson wrote:
> I wonder if it would be sufficient, for the first cut anyway, to let the user
> specify whatever was necessary to bypass all the ResourceLoader stuff
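Since the factories do not describe themselves, the hard-coded table Alex mentions might look something like the following. This is a sketch under stated assumptions: `FactoryCatalog`, `Param`, the type labels and the descriptions are all invented for illustration, and the two entries are typed in by hand rather than derived from Lucene.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FactoryCatalog {

    /** Hand-maintained description of one factory parameter. */
    static final class Param {
        final String name;
        final String type;
        final String description;

        Param(String name, String type, String description) {
            this.name = name;
            this.type = type;
            this.description = description;
        }
    }

    /** Factory name mapped to its parameters; the entries are illustrative, not exhaustive. */
    static final Map<String, List<Param>> CATALOG = new LinkedHashMap<>();

    static {
        CATALOG.put("lowercase", Collections.<Param>emptyList());
        CATALOG.put("stop", Arrays.asList(
                new Param("words", "resource", "name of the stop word list to load"),
                new Param("ignoreCase", "boolean", "match stop words case-insensitively")));
    }

    /** Renders the catalog entry as text a UI could show next to a dropdown. */
    public static String describe(String factory) {
        List<Param> params = CATALOG.get(factory);
        if (params == null) {
            return factory + ": not in the catalog";
        }
        if (params.isEmpty()) {
            return factory + ": no parameters";
        }
        StringBuilder sb = new StringBuilder(factory).append(":");
        for (Param p : params) {
            sb.append("\n  ").append(p.name).append(" (").append(p.type).append("): ")
              .append(p.description);
        }
        return sb.toString();
    }

    public static void main(String[] unused) {
        System.out.println(describe("stop"));
        System.out.println(describe("lowercase"));
    }
}
```

The obvious cost is that such a table has to be kept in step with the factories by hand, which is the maintenance burden the message is worried about.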
Re: Hints on constructing/running Solr analyzer chains standalone
Hmmm, sounds pretty cool!

I wonder if it would be sufficient, for the first cut anyway, to let the user specify whatever was necessary to bypass all the ResourceLoader stuff. Why make the user put the files in a place Solr knows about? Instead, for development, it might be sufficient (and less error prone) to require them to give the UI the information.

Of course _you're_ the one doing the work, so whatever you think best.

Erick

On Sat, Jul 12, 2014 at 10:16 AM, Alexandre Rafalovitch wrote:
> I don't want to read the schema.xml, but I do want to create factories
> using the same parameters they use in schema. So, it looks like I need
> to play around with ResourceLoaders and maybe SPI loaders, so things
> like wordlists get loaded.
Re: Hints on constructing/running Solr analyzer chains standalone
That sounds like a wonderful project, Alexandre. I've always wanted such a capability!

I suggest approaching this very pragmatically, based on minimizing the time to get something useful, which means leveraging as much as is available already: Solr's existing analysis UI screen. I suggest modifying the FieldAnalysisRequestHandler so that it can take optional input of a provided XML fieldType definition in the request instead of using the live schema. It would create a new temporary IndexSchema based on the provided data, then re-use the rest of its field analyzing code based on that schema.

Disclaimer: I have yet to look at FieldAnalysisRequestHandler.

~ David Smiley
Freelance Apache Lucene/Solr Search Consultant/Developer
http://www.linkedin.com/in/davidwsmiley

On Sat, Jul 12, 2014 at 1:16 PM, Alexandre Rafalovitch wrote:
> The idea for the project is to give a web UI where a user can quickly
> put one or more analyzer stacks together and see how it/they perform
> against text (multiple texts).
RE: Hints on constructing/running Solr analyzer chains standalone
The factories are part of Lucene, Solr is just using them. To list the available factories (on the classpath), use the (Tokenizer|TokenFilter|CharFilter)Factory.availableX() methods (they list all the names). You can invoke them using the corresponding forName() method and build an Analyzer from them. The latter has to be done manually, there is no general simple thing like Solr's chains. But that is quite easy to implement (if you really need an Analyzer instance). To just build a TokenStream for analysis, the factories are all you need (in fact Solr's chain just calls the factories in order... and returns the result as TokenStreamComponents).

You don't need to deal with SPI, just make the factories available on the classpath, Lucene finds them automatically.

For loading resources, use Lucene's ResourceLoader, which gets passed to the factory's inform() method. You only *need* to pass one if and only if the factory implements ResourceLoaderAware. There are several ResourceLoaders available. Solr has its own very complicated one, but the default Lucene ones are ClasspathResourceLoader and FilesystemResourceLoader.

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: Alexandre Rafalovitch [mailto:arafa...@gmail.com]
> Sent: Saturday, July 12, 2014 7:17 PM
> To: dev@lucene.apache.org
> Subject: Re: Hints on constructing/running Solr analyzer chains standalone
>
> I don't want to read the schema.xml, but I do want to create factories using
> the same parameters they use in schema. So, it looks like I need to play
> around with ResourceLoaders and maybe SPI loaders, so things like wordlists
> get loaded.
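The steps Uwe describes (look factories up by name, call inform() on the ResourceLoaderAware ones, then chain them into TokenStreamComponents) can be sketched roughly as below. This is a minimal sketch, assuming Lucene 4.9 with lucene-core and lucene-analyzers-common on the classpath. The class name `StandaloneChain`, the choice of the "standard" and "lowercase" factories, and the explicit `luceneMatchVersion` argument are assumptions made for illustration; package names and the `create()`/`createComponents()` signatures differ in later Lucene versions.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.util.ClasspathResourceLoader;
import org.apache.lucene.analysis.util.ResourceLoader;
import org.apache.lucene.analysis.util.ResourceLoaderAware;
import org.apache.lucene.analysis.util.TokenFilterFactory;
import org.apache.lucene.analysis.util.TokenizerFactory;

public class StandaloneChain {

    /** Fresh, mutable argument map; the factories consume the entries they understand. */
    static Map<String, String> baseArgs() {
        Map<String, String> args = new HashMap<>();
        args.put("luceneMatchVersion", "4.9"); // assumed to be needed by the 4.x factories
        return args;
    }

    /** Builds an Analyzer by calling the factories in order, as a Solr chain would. */
    public static Analyzer build() {
        final TokenizerFactory tokenizer = TokenizerFactory.forName("standard", baseArgs());
        final List<TokenFilterFactory> filters = new ArrayList<>();
        filters.add(TokenFilterFactory.forName("lowercase", baseArgs()));

        // Only factories that implement ResourceLoaderAware need a loader.
        ResourceLoader loader = new ClasspathResourceLoader();
        try {
            if (tokenizer instanceof ResourceLoaderAware) {
                ((ResourceLoaderAware) tokenizer).inform(loader);
            }
            for (TokenFilterFactory filter : filters) {
                if (filter instanceof ResourceLoaderAware) {
                    ((ResourceLoaderAware) filter).inform(loader);
                }
            }
        } catch (IOException e) {
            throw new RuntimeException("Could not load analysis resources", e);
        }

        return new Analyzer() {
            @Override
            protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
                Tokenizer source = tokenizer.create(reader);
                TokenStream result = source;
                for (TokenFilterFactory filter : filters) {
                    result = filter.create(result);
                }
                return new TokenStreamComponents(source, result);
            }
        };
    }

    /** Runs the text through the analyzer and collects the resulting terms. */
    public static List<String> analyze(Analyzer analyzer, String text) {
        List<String> terms = new ArrayList<>();
        try (TokenStream ts = analyzer.tokenStream("field", text)) {
            CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
            ts.reset();
            while (ts.incrementToken()) {
                terms.add(term.toString());
            }
            ts.end();
        } catch (IOException e) {
            throw new RuntimeException("Analysis failed", e);
        }
        return terms;
    }

    public static void main(String[] unused) {
        System.out.println(TokenizerFactory.availableTokenizers());
        System.out.println(TokenFilterFactory.availableTokenFilters());
        System.out.println(analyze(build(), "Hello Solr World"));
    }
}
```

Each call to forName() gets its own argument map because, as far as I can tell, the 4.x factories remove the parameters they recognize and reject whatever is left over.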
Re: Hints on constructing/running Solr analyzer chains standalone
I don't want to read the schema.xml, but I do want to create factories using the same parameters they use in schema. So, it looks like I need to play around with ResourceLoaders and maybe SPI loaders, so things like wordlists get loaded.

Starting from FieldAnalyzer turned out to be a dead-end because it was using pre-initialized field definitions. But starting again from Test cases seems to be somewhat more productive.

The idea for the project is to give a web UI where a user can quickly put one or more analyzer stacks together and see how it/they perform against text (multiple texts). A bit similar to FieldAnalyzer, but allowing multiple stacks side-by-side and NOT needing to reload the core to add new ones. Then, generate the XML definition, ready for pasting in. That's the target anyway.

Regards,
   Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853

On Sat, Jul 12, 2014 at 11:34 PM, Uwe Schindler wrote:
> To do analysis, Solr is not needed at all, unless you want to read schema.xml
> files. If you want to do this, that is quite easy using the IndexSchema
> class.
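The last step of the plan above, turning an assembled stack into an XML definition ready for pasting into a schema, might be sketched as follows. `FieldTypeXml`, `Component` and the sample stack are hypothetical names and values chosen for illustration; the output follows the usual `<fieldType>`/`<analyzer>` layout of a Solr schema.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FieldTypeXml {

    /** One element of the chain: tag is "charFilter", "tokenizer" or "filter". */
    static final class Component {
        final String tag;
        final String className;
        final Map<String, String> params = new LinkedHashMap<>();

        Component(String tag, String className) {
            this.tag = tag;
            this.className = className;
        }

        Component with(String key, String value) {
            params.put(key, value);
            return this;
        }
    }

    /** Escapes the characters that are unsafe inside a double-quoted XML attribute. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Renders the chain as a fieldType element with a single analyzer. */
    public static String toXml(String typeName, List<Component> chain) {
        StringBuilder sb = new StringBuilder();
        sb.append("<fieldType name=\"").append(escape(typeName))
          .append("\" class=\"solr.TextField\">\n  <analyzer>\n");
        for (Component c : chain) {
            sb.append("    <").append(c.tag)
              .append(" class=\"").append(escape(c.className)).append('"');
            for (Map.Entry<String, String> p : c.params.entrySet()) {
                sb.append(' ').append(p.getKey())
                  .append("=\"").append(escape(p.getValue())).append('"');
            }
            sb.append("/>\n");
        }
        sb.append("  </analyzer>\n</fieldType>\n");
        return sb.toString();
    }

    /** Sample stack used for the demonstration; the values are illustrative. */
    public static String demo() {
        List<Component> chain = new ArrayList<>();
        chain.add(new Component("tokenizer", "solr.StandardTokenizerFactory"));
        chain.add(new Component("filter", "solr.StopFilterFactory")
                .with("ignoreCase", "true")
                .with("words", "stopwords.txt"));
        return toXml("text_demo", chain);
    }

    public static void main(String[] unused) {
        System.out.print(demo());
    }
}
```

Keeping the parameters in insertion order means the generated attributes appear in the order the user entered them, which makes the pasted definition easier to compare with the UI.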
RE: Hints on constructing/running Solr analyzer chains standalone
Hi,

> Hmmm, I think it's reasonably straightforward to construct what is implied
> by a Solr analysis chain in Lucene, would that do? Or do you want to read a
> schema.xml file outside Solr?
>
> If the former, then you can pretty much skip the Solr code entirely.

Read this:
http://lucene.apache.org/core/4_9_0/core/org/apache/lucene/analysis/package-summary.html#package_description

To do analysis, Solr is not needed at all, unless you want to read schema.xml files. If you want to do this, that is quite easy using the IndexSchema class. You can then get the analyzer from the field type or field name. How to use the analyzer is described above and unrelated to Solr.

Uwe
Re: Hints on constructing/running Solr analyzer chains standalone
Hmmm, I think it's reasonably straightforward to construct what is implied by a Solr analysis chain in Lucene, would that do? Or do you want to read a schema.xml file outside Solr?

If the former, then you can pretty much skip the Solr code entirely.

FWIW,
Erick

On Sat, Jul 12, 2014 at 6:59 AM, Alexandre Rafalovitch wrote:
> I am interested in creating and running Solr analyzer chains outside
> of normal process (no live Solr). Just construct a chain, feed it
> tokens and see what happens.
Re: Hints on constructing/running Solr analyzer chains standalone
Uhm. That's where I did start. :-(

Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853

On Sat, Jul 12, 2014 at 9:50 PM, Jack Krupansky wrote:
> Tracing through indexing or query parsing is... a challenge. Start with
> something simpler like the analysis admin API.
>
> See:
> http://lucene.apache.org/solr/4_9_0/solr-core/org/apache/solr/handler/FieldAnalysisRequestHandler.html
Re: Hints on constructing/running Solr analyzer chains standalone
Tracing through indexing or query parsing is... a challenge. Start with something simpler like the analysis admin API.

See:
http://lucene.apache.org/solr/4_9_0/solr-core/org/apache/solr/handler/FieldAnalysisRequestHandler.html

-- Jack Krupansky

-----Original Message-----
From: Alexandre Rafalovitch
Sent: Saturday, July 12, 2014 9:59 AM
To: dev@lucene.apache.org
Subject: Hints on constructing/running Solr analyzer chains standalone

Hello,

I am interested in creating and running Solr analyzer chains outside of normal process (no live Solr). Just construct a chain, feed it tokens and see what happens.

I would appreciate any hints on what that takes and whether there are any hidden/weird dependencies (e.g. for resource discoveries). I tried tracing through FieldAnalysis calls, but can't actually seem to find the point where the actual analysis is done. Just getting lost in sets of NamedList

Regards,
   Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853