Re: maximum number of simultaneous threads

2013-05-14 Thread Dmitry Kan
venkata, If you are after search scaling, then the webapp server (like tomcat, jetty etc) handles allocation of threads per client connection (maxThreads for jetty for instance). Inside one client request SOLR uses threads for various tasks, but I don't have any exact figures (not sure if wiki has

Re: Issue with getting highlight with hl.maxAnalyzedChars = -1

2013-05-17 Thread Dmitry Kan
removing hl.maxAnalyzedChars= -1 > from search query. > > > Dmitry Kan-2 wrote > > You didn't say, what is exactly going weird.. > > > > > > On Fri, May 10, 2013 at 2:19 PM, meghana < > > > meghana.ravani@ > > > > wrote: > > >

[custom data structure] aligned dynamic fields

2013-05-20 Thread Dmitry Kan
Hi all, Our current project requirement suggests that we should start storing custom data structures in solr index. The custom data structure would be an equivalent of C struct. The task is as follows. Suppose we have two types of fields, one is FieldName1 and the other FieldName2. Suppose also

Re: [custom data structure] aligned dynamic fields

2013-05-22 Thread Dmitry Kan
ers. If you start down a design path and find > that you are heavily dependent on dynamic fields and/or multi-valued fields > with large numbers of values per document, that is feedback that your > design needs to be denormalized and flattened further. > > -- Jack Krupansky >

Re: Documents

2013-06-07 Thread Dmitry Kan
hi, you need to parse your custom xml file and transform it into an xml file in the format solr understands. If you are familiar with xslt, you could do that in a few lines depending on the complexity of the input xml file. Dmitry On Fri, Jun 7, 2013 at 3:34 PM, wrote: > Good mornin

[CROSS-POSTING] SOLR-4903 and SOLR-4904

2013-06-07 Thread Dmitry Kan
CROSS-POSTING from dev list. Hi guys, As discussed with Grant and Andrzej I have created two jiras related to inefficiency in distributed faceting. This affects 3.4, but my gut feeling is telling me 4.x is affected as well. Regards, Dmitry Kan P.S. Asking this question won yours truly second

Re: Doubt Regarding Shards Index

2013-06-07 Thread Dmitry Kan
Hi, Sharding by time by itself does not need any custom code on solr side: start indexing your data to a shard, depending on the timestamp of your document. The querying part is trickier if you want to have one front end solr: it should know which shards to query. If querying all shards for each
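A minimal sketch of the routing idea in the message above, assuming one shard per calendar month and made-up shard host names; neither the monthly granularity nor the `shard-YYYY-MM` naming comes from the thread. Solr's `shards` request parameter takes a comma-separated list, so a front end could list only the shards that overlap the queried time range:

```python
from datetime import date


def shard_for(day: date) -> str:
    """Return the (hypothetical) shard address holding documents for this day."""
    return f"shard-{day.year:04d}-{day.month:02d}:8983/solr"


def shards_param(start: date, end: date) -> str:
    """Build a value for the `shards` parameter covering the months in [start, end]."""
    shards = []
    year, month = start.year, start.month
    # Walk month by month until we pass the month of the end date.
    while (year, month) <= (end.year, end.month):
        shards.append(shard_for(date(year, month, 1)))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return ",".join(shards)
```

The same `shard_for` function would serve at indexing time to pick the shard a document is sent to, which keeps indexing and querying in agreement about where a timestamp lives.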

Re: retrieve datefield value from document

2013-06-14 Thread Dmitry Kan
Maybe a document was marked as deleted? *isDeleted * On Fri, Jun 14, 2013 at 11:25 PM, Michael Della Bitta < michael.della.bi...@appinions.com> wrote: > Shot in the dark: > > You're using Lucene

solr 1.4 facet.limit behaviour in merging from several shards

2011-09-02 Thread Dmitry Kan
y does the merging SOLR combine the results from shard, when they exceed the facet.limit? Please ask questions, if something isn't clear or you need more details. Thanks, Dmitry Kan

solr 1.4 highlighting issue

2011-09-13 Thread Dmitry Kan
isubmersibles and drillships) are 21 deepwater <em>drilling</em> Why did solr highlight "drilling" even though there is no "ships" in the text? * *-- Regards, Dmitry Kan

Re: solr 1.4 facet.limit behaviour in merging from several shards

2011-09-14 Thread Dmitry Kan
Hi Chris, Thanks for taking this. Sorry for my confusing explanation. Since you requested a bigger picture, I'll give some more detail. In short: we don't do date facets, and sorting by date in reverse order happens naturally by design. All the data is split to shards. We use logical sharding, no

Re: Out of memory

2011-09-14 Thread Dmitry Kan
Hi Rohit, Do you use caching? How big is your index in size on the disk? What is the stack trace contents? The OOM problems that we have seen so far were related to the index physical size and usage of caching. I don't think we have ever found the exact cause of these problems, but sharding has h

Re: Out of memory

2011-09-14 Thread Dmitry Kan
> > Regards, > Rohit > > > -Original Message- > From: Dmitry Kan [mailto:dmitry@gmail.com] > Sent: 14 September 2011 08:15 > To: solr-user@lucene.apache.org > Subject: Re: Out of memory > > Hi Rohit, > > Do you use caching? > How big is your index in si

Re: solr 1.4 highlighting issue

2011-09-14 Thread Dmitry Kan
. Dmitry On Wed, Sep 14, 2011 at 2:20 PM, Koji Sekiguchi wrote: > (11/09/14 15:54), Dmitry Kan wrote: > >> Hello list, >> >> Not sure how many of you are still using solr 1.4 in production, but here >> is >> an issue with highlighting, that we

Re: solr 1.4 highlighting issue

2011-09-14 Thread Dmitry Kan
The whole document should satisfy the query (ie it probably > has ships/s somewhere else in it), but each snippet won't generally have all > the terms. > > -Mike > > > On 9/14/2011 2:54 AM, Dmitry Kan wrote: > >> Hello list, >> >> Not sure how many

Re: Out of memory

2011-09-15 Thread Dmitry Kan
> Rohit > > > -Original Message- > From: Dmitry Kan [mailto:dmitry@gmail.com] > Sent: 14 September 2011 10:23 > To: solr-user@lucene.apache.org > Subject: Re: Out of memory > > Hi, > > OK 64GB fits into one shard quite nicely in our setup. But I have neve

Re: Out of memory

2011-09-15 Thread Dmitry Kan
wrote: > It's happening more in search and search has become very slow particularly > on the core with 69GB index data. > > Regards, > Rohit > > -Original Message- > From: Dmitry Kan [mailto:dmitry@gmail.com] > Sent: 15 September 2011 07:51 > To: s

Re: solr 1.4 facet.limit behaviour in merging from several shards

2011-09-23 Thread Dmitry Kan
/#xyproblem > : XY Problem > : > : Your question appears to be an "XY Problem" ... that is: you are dealing > : with "X", you are assuming "Y" will help you, and you are asking about > "Y" > : without giving more details about the "X" so that we can understand the > : full issue. Perhaps the best solution doesn't involve "Y" at all? > : See Also: http://www.perlmonks.org/index.pl?node_id=542341 > > > -Hoss > -- Regards, Dmitry Kan

Re: solr 1.4 facet.limit behaviour in merging from several shards

2011-10-06 Thread Dmitry Kan
Hello, I'm now in the process of migrating solr 1.4 -> solr 3.4. It is done already, just wide scale testing remains. I'll report back if anything pops up related to the jira ticket. Otherwise I could work closer on the issue, unless it is fixed. Dmitry On Wed, Oct 5, 2011 at 4:24 AM, Yonik Seel

Re: solr 1.4 facet.limit behaviour in merging from several shards

2011-10-06 Thread Dmitry Kan
Thanks Hoss, I can look into that, once done with solr router migration 1.4->3.4. Dmitry On Wed, Oct 5, 2011 at 2:13 AM, Chris Hostetter wrote: > > : OK, if SOLR-2403 being related to the bug I described, has been fixed in > : SOLR 3.4 then we are safe, since we are in the process of migration.

wild card search and lower-casing

2011-11-18 Thread Dmitry Kan
r-cased to ocvd. Does SOLR skip a lower-casing step when doing the actual wild-card search? BTW, the same issue for a trailing wild-card: mocv* produces hits, while MOCV* doesn't. Appreciate any help or pointers. -- Regards, Dmitry Kan

Re: wild card search and lower-casing

2011-11-18 Thread Dmitry Kan
> > > *OCVD > > > > it doesn't > > This is a FAQ. Please see > > > http://wiki.apache.org/lucene-java/LuceneFAQ#Are_Wildcard.2C_Prefix.2C_and_Fuzzy_queries_case_sensitive.3F > -- Regards, Dmitry Kan

Re: wild card search and lower-casing

2011-11-18 Thread Dmitry Kan
ps://issues.apache.org/jira/browse/SOLR-218 > You can vote this issue. For the time being you can lowercase them in the > client side. > -- Regards, Dmitry Kan
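The client-side workaround suggested above can be sketched as follows. This is only an illustration, assuming whitespace-separated query terms and an optional `field:` prefix; the function name is made up, and real query strings with quoted phrases or ranges would need more careful parsing:

```python
def lowercase_wildcard_terms(query: str) -> str:
    """Lowercase the value part of any term that contains * or ?."""
    terms = []
    for term in query.split():
        if "*" in term or "?" in term:
            # Keep an optional "field:" prefix as typed; lowercase only the value.
            field, sep, value = term.rpartition(":")
            term = field + sep + value.lower()
        terms.append(term)
    return " ".join(terms)
```

Non-wildcard terms and operators such as `AND` pass through unchanged, since they still go through the normal query-time analysis.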

Re: wild card search and lower-casing

2011-11-18 Thread Dmitry Kan
ueryParser extends that and default behavior may different. For > clarification see source code of SolrQueryParser. > -- Regards, Dmitry Kan

Re: wild card search and lower-casing

2011-11-20 Thread Dmitry Kan
Thanks Erick. Do you think the patch you are working on will be applicable as well to 3.4? Best, Dmitry On Mon, Nov 21, 2011 at 5:06 AM, Erick Erickson wrote: > As it happens I'm working on SOLR-2438 which should address this. This > patch > will provide two things: > > The ability to define a

Re: wild card search and lower-casing

2011-11-22 Thread Dmitry Kan
l have to change it after > applying > the patch for this to work for you. Should be trivial, I'll leave a note > in the > code about this, look for SOLR-2438 in the 3x code line for the place > to change. > > On Mon, Nov 21, 2011 at 2:14 AM, Dmitry Kan wrote: > > T

Re: wild card search and lower-casing

2011-11-22 Thread Dmitry Kan
to pre 3.6 code. It would be a good field test > if that worked for you. > > But you can't do any of this until the JIRA (SOLR-2438) is > marked "Resolution: Fixed". > > Don't be fooled by "Fix Version". "Fix Version" simply says > that those

Re: Search on multiple fields is not working

2011-11-23 Thread Dmitry Kan
for tagName:"MUSIC" AND > "DESIGNER".The results are not containing profileId 99964 and 10076. > > Can anybody tell what i am doing wrong? > > Regards, > Siva > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Search-on-multiple-fields-is-not-working-tp3530145p3530145.html > Sent from the Solr - User mailing list archive at Nabble.com. > -- Regards, Dmitry Kan

Re: wild card search and lower-casing

2011-11-23 Thread Dmitry Kan
tuation though. > > Best > Erick > > On Tue, Nov 22, 2011 at 10:46 AM, Dmitry Kan wrote: > > Thanks, Erick. I was in fact reading the patch (the one attached as a > > file to the aforementioned jira) you updated sometime yesterday. I'll > > watch the issue, b

Re: Huge Performance: Solr distributed search

2011-11-23 Thread Dmitry Kan
Hello, Is this log from the frontend SOLR (aggregator) or from a shard? Can you merge, e.g. 3 shards together or is it much effort for your team? In our setup we currently have 16 shards with ~30GB each, but we rarely search in all of them at once. Best, Dmitry On Wed, Nov 23, 2011 at 3:12 PM,

Re: Huge Performance: Solr distributed search

2011-11-23 Thread Dmitry Kan
> > > Can you merge, e.g. 3 shards together or is it much effort for your team? > Yes, we can merge. We'll try to do this and review how it will works > Thanks, Dmitry > > Any another ideas? > > On Wed, Nov 23, 2011 at 4:01 PM, Dmitry Kan wrote: > > He

Re: Huge Performance: Solr distributed search

2011-11-25 Thread Dmitry Kan
at 4:38 PM, Artem Lokotosh wrote: > >> Is this log from the frontend SOLR (aggregator) or from a shard? > > from aggregator > > > >> Can you merge, e.g. 3 shards together or is it much effort for your > team? > > Yes, we can merge. We'll try to do this an

cache monitoring tools?

2011-12-06 Thread Dmitry Kan
00221 schema excerpt: -- Regards, Dmitry Kan

Re: Solr request handler queries in fiddler

2011-12-06 Thread Dmitry Kan
help me out > with this. > > Thank u in advance > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Solr-request-handler-queries-in-fiddler-tp3564260p3564260.html > Sent from the Solr - User mailing list archive at Nabble.com. > -- Regards, Dmitry Kan

Re: cache monitoring tools?

2011-12-07 Thread Dmitry Kan
crease the maxsize > value to your acceptable limit. > > Regards > Pravesh > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/cache-monitoring-tools-tp3566645p3566811.html > Sent from the Solr - User mailing list archive at Nabble.com. > -- Regards, Dmitry Kan

Re: Solr request handler queries in fiddler

2011-12-07 Thread Dmitry Kan
> > > > On Wed, Dec 7, 2011 at 12:54 PM, Dmitry Kan [via Lucene] < > ml-node+s472066n351...@n3.nabble.com> wrote: > > > If you mean debugging the queries, you can use eclipse+jetty plugin setup > > ( > > http://code.google.com/p/run-jetty-run/) with

Re: cache monitoring tools?

2011-12-07 Thread Dmitry Kan
tool. See > http://wiki.apache.org/solr/SolrJmx > > On Wed, Dec 7, 2011 at 6:19 AM, Dmitry Kan wrote: > > > Yes, we do require that much. > > Ok, thanks, I will try increasing the maxsize. > > > > On Wed, Dec 7, 2011 at 10:56 AM, pravesh wrote: > > >

Re: cache monitoring tools?

2011-12-07 Thread Dmitry Kan
The culprit seems to be the merger (frontend) SOLR. Talking to one shard directly takes substantially less time (1-2 sec). On Wed, Dec 7, 2011 at 4:10 PM, Dmitry Kan wrote: > Tomás: thanks. The page you gave didn't mention cache specifically, is > there more documentation on this s

Re: cache monitoring tools?

2011-12-07 Thread Dmitry Kan
idworks-with-zabbix/ > > On Wed, Dec 7, 2011 at 11:49 AM, Dmitry Kan wrote: > > > The culprit seems to be the merger (frontend) SOLR. Talking to one shard > > directly takes substantially less time (1-2 sec). > > > > On Wed, Dec 7, 2011 at 4:10 PM, Dmitry Kan wrot

Re: cache monitoring tools?

2011-12-12 Thread Dmitry Kan
r the wire, multiplied by the number of shards, multiplied by some > constant (i think it's 2 but it might be higher) in order to "over > request" facet constraint counts from each shard to aggregate them. > > the dominant factor in the slow speed you are seeing is most likely > Network IO between the shards. > > > > -Hoss > -- Regards, Dmitry Kan

Re: cache monitoring tools?

2011-12-12 Thread Dmitry Kan
graphs of loads, e.g. cache counts or CPU load, in > parallel to a console log or to an http request log?? > > I am working on such a tool currently but I have a bad feeling of > reinventing the wheel. > > thanks in advance > > Paul > > > > Le 8 déc. 2011 à 08:53, D

Re: cache monitoring tools?

2011-12-12 Thread Dmitry Kan
he counts or CPU > > load, in parallel to a console log or to an http request log?? > > > > I am working on such a tool currently but I have a bad feeling of > reinventing the wheel. > > > > thanks in advance > > > > Paul > > > > >

Re: Virtual Memory very high

2011-12-13 Thread Dmitry Kan
If you allow me to chime in, is there a way to check for which DirectoryFactory is in use, if ${solr.directoryFactory:solr.StandardDirectoryFactory} has been configured? Dmitry 2011/12/12 Yury Kats > On 12/11/2011 4:57 AM, Rohit wrote: > > What are the difference in the different DirectoryFacto

Re: NumericRangeQuery: what am I doing wrong?

2011-12-14 Thread Dmitry Kan
Maybe you should index your values differently? Here is what Lucene's 2.9 javadoc says: To use this, you must first index the numeric values using NumericField (expert: NumericTokenStream

Re: cache monitoring tools?

2011-12-14 Thread Dmitry Kan
rvers with replicated > indexes that handle the queries, while our master handles > updates/commits. > > Justin > > Dmitry Kan writes: > > > Justin, in terms of the overhead, have you noticed if Munin puts much of > it > > when used in production? In terms

Re: disable stemming on query parser.

2011-12-16 Thread Dmitry Kan
You can disable stemming in a copy field. So you need to define one field with your input data on which stemming will be done and the other field (copy field), on which stemming will not be done. Then on the client you can decide which field to search against. Dmitry On Fri, Dec 16, 2011 at 2:00
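A small sketch of the client-side choice described above, assuming a stemmed field called `text` and an unstemmed copy called `text_exact`; both field names are hypothetical and would have to match whatever the schema actually defines:

```python
from urllib.parse import urlencode

STEMMED_FIELD = "text"          # hypothetical: field whose analyzer includes the stemmer
UNSTEMMED_FIELD = "text_exact"  # hypothetical: copy field analyzed without stemming


def select_url(base_url: str, term: str, exact: bool) -> str:
    """Build a select URL that searches the stemmed or the unstemmed field."""
    field = UNSTEMMED_FIELD if exact else STEMMED_FIELD
    return base_url + "/select?" + urlencode({"q": f"{field}:{term}"})
```

For example, `select_url("http://localhost:8983/solr", "running", exact=True)` targets the unstemmed copy, so only the literal form of the word should match.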

Re: disable stemming on query parser.

2011-12-19 Thread Dmitry Kan
.com/disable-stemming-on-query-parser-tp3591420p3597597.html > Sent from the Solr - User mailing list archive at Nabble.com. > -- Regards, Dmitry Kan

Re: cache monitoring tools?

2011-12-19 Thread Dmitry Kan
> Justin > > 1. http://exchange.munin-monitoring.org/plugins/search?keyword=solr > > Dmitry Kan writes: > > > Thanks, Justin. With zabbix I can gather jmx exposed stats from SOLR, how > > about munin, what protocol / way it uses to accumulate stats? It wasn't > > o

a question on jmx solr exposure

2011-12-21 Thread Dmitry Kan
going to see under solr/ ? From the numbers (e.g. numDocs of searcher), jconsole sees the stats of A. Where do the stats of B go? Or does the first activated core capture the jmx "pipe" and not let B's stats go through? -- Regards, Dmitry Kan

Re: a question on jmx solr exposure

2011-12-21 Thread Dmitry Kan
Solved by exposing jmx only on one of the cores, as it is of more interest than the other one. Dmitry On Wed, Dec 21, 2011 at 11:56 AM, Dmitry Kan wrote: > Hello list, > > This might be not the right place to ask the jmx specific questions, but I > decided to try, as we are

Re: [Announce] Solr 3.5 with RankingAlgorithm 1.3, NRT support

2011-12-27 Thread Dmitry Kan
Hello Nagendra, Congratulations on the new release! In terms of downloading: does one need to be registered on the site to download the bundle? The download links lead to http://solr-ra.tgels.org/solr-ra.jsp. Regards, Dmitry Kan On Tue, Dec 27, 2011 at 4:30 PM, Nagendra Nagarajayya

distributed faceting: refineFacets()

2011-12-29 Thread Dmitry Kan
spond with non-intersecting hits. That practically means, that the merger should simply "concatenate" the shard results into one list (automatically pre-sorted by design). Can something be improved in the SOLR merger facet logic here? Should we look at something else as well? -- Thanks, Dmitry Kan

Re: a question on jmx solr exposure

2011-12-29 Thread Dmitry Kan
on3 > ... > > On Wed, Dec 21, 2011 at 1:56 PM, Dmitry Kan wrote: > > Hello list, > > > > This might be not the right place to ask the jmx specific questions, but > I > > decided to try, as we are polling SOLR statistics through jmx. > > > > We cu

Re: a question on jmx solr exposure

2011-12-29 Thread Dmitry Kan
That's absolutely right. Thanks for the suggestion. On Thu, Dec 29, 2011 at 2:47 PM, Gora Mohanty wrote: > On Thu, Dec 29, 2011 at 6:15 PM, Dmitry Kan wrote: > > Well, we don't use multicore feature of SOLR, so in our case SOLR > instances > > are just separate w

Re: best way to force substitutions in data

2012-01-10 Thread Dmitry Kan
> > -- > View this message in context: > http://lucene.472066.n3.nabble.com/best-way-to-force-substitutions-in-data-tp3646195p3646195.html > Sent from the Solr - User mailing list archive at Nabble.com. > -- Regards, Dmitry Kan

FacetComponent: suppress original query

2012-01-12 Thread Dmitry Kan
submitted too. Is there a way of suppressing the original query? -- Regards, Dmitry Kan

Re: Facets, Get top 10 categories

2012-01-13 Thread Dmitry Kan
". > > Thanks, > Manish. > -- Regards, Dmitry Kan

Re: Facets, Get top 10 categories

2012-01-13 Thread Dmitry Kan
ny calculation > based on resultset count. > > On Fri, Jan 13, 2012 at 5:44 PM, Dmitry Kan wrote: > > You could do this on the client side, just read 10 first facets off the > top > > of the list and mark the remaining as "Others". > > > > On Fri, Jan 13, 2
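The client-side approach quoted above (keep the first 10 facets, label the rest "Others") could look roughly like this. It is a sketch under two assumptions that are not in the thread: the facets arrive as (value, count) pairs already sorted by count, and the "Others" bucket should carry the sum of the remaining counts:

```python
def top_n_with_others(facets, n=10):
    """Keep the first n (value, count) pairs and fold the remainder into "Others"."""
    top = list(facets[:n])
    # Assumption: the "Others" count is the sum of the counts that were cut off.
    rest = sum(count for _, count in facets[n:])
    if rest:
        top.append(("Others", rest))
    return top
```

The request still has to ask Solr for more than `n` facet values, otherwise there is nothing left over to fold into the extra bucket.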

Re: FacetComponent: suppress original query

2012-01-13 Thread Dmitry Kan
Thu, Jan 12, 2012 at 4:49 PM, Dmitry Kan wrote: > Hello list, > > I need to split the incoming original facet query into a list of > sub-queries. The logic is done and each sub-query gets added into outgoing > queue with rb.addRequest(), where rb is instance of ResponseBuilder. >

Re: FacetComponent: suppress original query

2012-01-14 Thread Dmitry Kan
worked > arround the maxBooleanClauses (ie: "That part is done") but you didn't say > how, and in your followup quesiton, it sounds like you are still hitting > the limit of maxBooleanClauses. > > So what exactly have you changed/done that is "done" and what is the > new problem? > > > -Hoss > -- Regards, Dmitry Kan

Re: FacetComponent: suppress original query

2012-01-17 Thread Dmitry Kan
Yes, that's what I have started to use already. Probably, this is the easiest solution. Thanks. On Tue, Jan 17, 2012 at 3:03 AM, Erick Erickson wrote: > Why not just up the maxBooleanClauses parameter in solrconfig.xml? > > Best > Erick > > On Sat, Jan 14, 2012 at 1:4

Re: really slow performance when trying to get facet.field

2012-01-17 Thread Dmitry Kan
ion: Should I reduce the documents in one shard, so that the index is > equal or less the Java Heap size for this shard? Or is > there another method to avoid this slow calls? > > Thank you > > Daniel > -- Regards, Dmitry Kan

Re: really slow performance when trying to get facet.field

2012-01-17 Thread Dmitry Kan
autowarmCount="0"/> > > How big was your index? Did it fit into the RAM which you gave the Solr > instance? > > Thanks > > > On Tue, Jan 17, 2012 at 1:56 PM, Dmitry Kan wrote: > > > I had a similar problem for a similar task. And in my case

Re: really slow performance when trying to get facet.field

2012-01-18 Thread Dmitry Kan
15mio per shard and see what will happen here. > > > > This thing is, that I will add more shards over time, so that I can > handle > > maybe 500-800mio documents. Maybe more. It depends. > > > > On Tue, Jan 17, 2012 at 2:14 PM, Dmitry Kan > wrote: > > > >>

Re: Question on Reverse Indexing

2012-01-18 Thread Dmitry Kan
still working in spite of disabling the ReversedWildcardFilterFactory filter. > > > > > > This behavior is puzzling everyone and wanted to know how this behavior > of reverse indexing works? > > > > Can anyone share with me on this Solr behavior. > > > > -Shyam > > > > -- Regards, Dmitry Kan

Re: Question on Reverse Indexing

2012-01-18 Thread Dmitry Kan
Dimitry, > > Using http://localhost:7070/solr/docs/admin/analysis.jsp passed the query > *lock and did not find ReversedWildcardFilterFactory to the indexer or any > other filters that could do the reversing. > > -Shyam > > -----Original Message- > From: Dmitry Kan

Re: Question on Reverse Indexing

2012-01-18 Thread Dmitry Kan
words="stopwords.txt" ignoreCase="true"/> > > > > But when it was found that ReversedWildcardFilterFactory is adding > performance burden we removed the ReversedWildcardFilterFactory filter > withOriginal="true"

Re: ReversedWildcardFilterFactory Question

2012-01-18 Thread Dmitry Kan
h a text field and a text_rev field, or is it sufficient to just > index the information into a text_rev field? I *think* that it only > needs to be in text_rev, but I want to make sure before I go mucking > with my schema. > -- Regards, Dmitry Kan

Re: Question on Reverse Indexing

2012-01-19 Thread Dmitry Kan
am Bhaskaran [mailto:shyam.bhaska...@synopsys.com] > Sent: Thursday, January 19, 2012 6:29 AM > To: solr-user@lucene.apache.org > Subject: RE: Question on Reverse Indexing > > Dimitry, > > Completed a clean index and I still see the same behavior. > > Did not use Luke but fr

Re: Question on Reverse Indexing

2012-01-19 Thread Dmitry Kan
>stored="true" multiValued="true" termVectors="true" termPositions="true" > termOffsets="true" /> >multiValued="false" /> >stored="true" multiValued="true" /> > > > > Excerpt

Re: Question on Reverse Indexing

2012-01-20 Thread Dmitry Kan
FilterFactory I can see the size of > indexes increase but when I remove the filter the size decreases but in > either case I am seeing the leading wild card query working. > > > -Shyam > > -Original Message- > From: Dmitry Kan [mailto:dmitry@gmail.com] > Se

Re: Size of index to use shard

2012-01-24 Thread Dmitry Kan
Hi, The article you gave mentions 13GB of index size. It is quite a small index from our perspective. We have noticed, that at least solr 3.4 has some sort of "choking" point with respect to growing index size. It just becomes substantially slower than what we need (a query on avg taking more than 3

Re: Size of index to use shard

2012-01-26 Thread Dmitry Kan
u should periodically > >> re-test the empirical numbers you *do* arrive at... > >> > >> Best > >> Erick > >> > >> On Tue, Jan 24, 2012 at 5:31 AM, Anderson vasconcelos > >> wrote: > >>> Apparently, not so easy to determine wh

Re: error opening index solr 4.0 with lukeall-4.0.0-ALPHA.jar

2012-12-09 Thread Dmitry Kan
you test and find any. Regards, Dmitry Kan On Fri, Dec 7, 2012 at 5:50 PM, Neil Ireson wrote: > In case it is of use, I have just uploaded an updated and mavenised > version of the Luke code to the Luke discussion list, see > https://groups.google.com/d/topic/luke-discuss/MNT_

Re: Question on WordDelimiterFilterFactory use

2012-12-26 Thread Dmitry Kan
Hi, Have you tried looking at admin analysis page? You can see how i-pod gets indexed and highlight query results there too. Best, Dmitry Kan On Wed, Dec 26, 2012 at 10:08 AM, Jose Yadao wrote: > Hi and Happy Holidays to everyone. > > I have a question regarding t

Re: using PositionIncrementAttribute to increment certain term positions to large values

2012-12-27 Thread Dmitry Kan
field. As for the performance, no major delays compared to the original proximity search implementation have been noticed. Best, Dmitry Kan On Wed, Dec 19, 2012 at 10:53 AM, Dmitry Kan wrote: > Dear list, > > We are currently evaluating proximity searches ("term1 term2" ~slop

Re: Which token filter can combine 2 terms into 1?

2012-12-27 Thread Dmitry Kan
Hi, Have a look at TokenFilter. Extending it will give you access to a TokenStream. Regards, Dmitry Kan On Fri, Dec 21, 2012 at 9:05 AM, Xi Shen wrote: > Hi, > > I am looking for a token filter that can combine 2 terms into 1? E.g. > > the input has been tokenized by white s

Re: regex and highlighter component: highlight and return individual fragments inside a snippet

2013-01-14 Thread Dmitry Kan
ngParameters#hl.snippets> Dmitry On Mon, Jan 14, 2013 at 3:14 PM, Dmitry Kan wrote: > Hello! > > I'm playing with the regex feature of highlighting in SOLR. The regex I > have is pretty simple and, given a keyword query, it hits in a few places > inside each document. &

Re: access matched token ids in the FacetComponent?

2013-01-18 Thread Dmitry Kan
2013 at 2:08 PM, Mikhail Khludnev < mkhlud...@griddynamics.com> wrote: > Dmitry, > > I have some relevant experience and ready to help, but I can not get the > core problem. Could you please expand the description and/or provide a > sample? > > > On Tue, Jan 15, 2013 at

Re: using PositionIncrementAttribute to increment certain term positions to large values

2013-01-18 Thread Dmitry Kan
Hi, For the sake of story completeness, I was able to fix the highlighter to work with the token matches that go beyond the length of the text field. The solution was to mod on matched token positions, if they exceed the length of the text. Dmitry On Thu, Dec 27, 2012 at 10:13 AM, Dmitry Kan

Re: access matched token ids in the FacetComponent?

2013-01-18 Thread Dmitry Kan
a word in a sentence to 1 > - play with facet by function patch > https://issues.apache.org/jira/browse/SOLR-1581 accomplished by tf() > function. > > It doesn't seem like much help. > > On Fri, Jan 18, 2013 at 12:42 PM, Dmitry Kan wrote: > > > that we actually require

SOLR 4.x: multiterm phrase inside proximity searches possible?

2013-01-18 Thread Dmitry Kan
Hello! Does SOLR 4.x support / is going to support the multi-term phrase search inside proximity searches? To illustrate, we would like the following to work: "\a b\" c"~10 which would return hits with "a b" 10 tokens away from c in no particular order. It looks like https://issues.apache.org/

Re: SOLR 4.x: multiterm phrase inside proximity searches possible?

2013-01-18 Thread Dmitry Kan
r/SurroundQueryParser<http://wiki.apache.org/solr/SurroundQueryParser> > > But, surround does not support regex terms, just wildcards. > > -- Jack Krupansky > > -Original Message- From: Dmitry Kan > Sent: Friday, January 18, 2013 8:59 AM > To: solr-user@luce

Re: SOLR 4.x: multiterm phrase inside proximity searches possible?

2013-01-18 Thread Dmitry Kan
n 18, 2013 at 4:44 PM, Jack Krupansky wrote: > Unfortunately, yes. > > > -- Jack Krupansky > > -Original Message- From: Dmitry Kan > Sent: Friday, January 18, 2013 9:42 AM > To: solr-user@lucene.apache.org > Subject: Re: SOLR 4.x: multiterm phrase inside proximity

Re: SOLR 4.x: multiterm phrase inside proximity searches possible?

2013-01-18 Thread Dmitry Kan
Yep, that's my issue: we still use solr 3.4. On Fri, Jan 18, 2013 at 4:57 PM, Jack Krupansky wrote: > LUCENE-2754 is already in Lucene 4.0 - SpanMultiTermQueryWrapper. > > > -- Jack Krupansky > > -Original Message- From: Dmitry Kan > Sent: Friday, January 18,

SOLR-1604

2013-01-18 Thread Dmitry Kan
Hello! Is there some activity on SOLR-1604? Can one of the contributors answer two simple questions? https://issues.apache.org/jira/browse/SOLR-1604?focusedCommentId=13557053&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13557053 Regards, Dmitry

Re: access matched token ids in the FacetComponent?

2013-01-21 Thread Dmitry Kan
is BooleanScorer2 (but not BooleanScorer!), you can access > the SpanQueryScorer in one of the legs and try to access the matched spans > - if you are in 3.x you'll have a problem with disjunction queries. > > it seems challenging, doesn't it? > > 18.01.2013 17:40 пользова

Re: access matched token ids in the FacetComponent?

2013-01-21 Thread Dmitry Kan
or will accept the scorer instance > - if this scorer is BooleanScorer2 (but not BooleanScorer!), you can access > the SpanQueryScorer in one of the legs and try to access the matched spans > - if you are in 3.x you'll have a problem with disjunction queries. > > it seems challe

Re: access matched token ids in the FacetComponent?

2013-01-23 Thread Dmitry Kan
;64К docs with 5-20 span positions per each. Search result length 100-2000 > docs with 3-5 facet fields. It shows 100 q/sec on an average datacenter > box." > > > On Mon, Jan 21, 2013 at 5:23 PM, Dmitry Kan wrote: > > > Mikhail, > > > > Thanks for the guidance!

Re: Hi

2013-01-24 Thread Dmitry Kan
(start-off-topic): Alexandre, nice ideas. Last in the *) list is a bit far stretched, but still good. I would still add one: how to have exact matches and inexact matches in the same analyzed field. (end-off-topic) On Wed, Jan 23, 2013 at 2:40 PM, Alexandre Rafalovitch wrote: > We need a "Make yo

Re: long QTime for big index

2013-01-31 Thread Dmitry Kan
Does debugQuery=true tell anything useful for these? Like what is the component taking most of the 30 seconds. Do you have evictions in your solr caches? Dmitry On Thu, Jan 31, 2013 at 10:01 AM, Mou wrote: > I am running solr 3.4 on tomcat 7. > > Our index is very big , two cores each 120G. We

Re: MockAnalyzer in Lucene: attach stemmer or any custom filter?

2013-02-15 Thread Dmitry Kan
zer. > This is the way all lucene's own analysis tests work: e.g. > > http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/analysis/common/src/test/org/apache/lucene/analysis/en/TestEnglishMinimalStemFilter.java > > On Thu, Feb 14, 2013 at 7:40 AM, Dmitry Kan wrote: >

Re: MockAnalyzer in Lucene: attach stemmer or any custom filter?

2013-02-15 Thread Dmitry Kan
Amazing. Thanks! On Fri, Feb 15, 2013 at 7:07 PM, Robert Muir wrote: > For 3.4, extend ReusableAnalyzerBase > > On Fri, Feb 15, 2013 at 12:06 PM, Dmitry Kan wrote: > > Thanks a lot, Robert. > > > > I need to study a bit more closely the link you have sent. I have

Re: WIKI: Does JSON Update format actually support single-object submit?

2013-02-19 Thread Dmitry Kan
To clarify a bit: > I did a quick test with my example and it seemed to fail with [] > but passing with []. did you mean to use {} in one of these? Dmitry On Sun, Feb 17, 2013 at 4:22 AM, Alexandre Rafalovitch wrote: > I am looking at the Solr WIKI and some of the examples seem to contradict >

Re: WIKI: Does JSON Update format actually support single-object submit?

2013-02-20 Thread Dmitry Kan
nts from happening all at > once. Lately, it doesn't seem to be working. (Anonymous - via GTD book) > > > On Tue, Feb 19, 2013 at 3:11 AM, Dmitry Kan wrote: > > > To clarify a bit: > > > > > I did a quick test with my example and it seemed to fail with []

Re: Search returns hits but highlighting does not work for certain field

2013-03-04 Thread Dmitry Kan
Can you also show, how you define a field rawData in the schema? Dmitry On Mon, Mar 4, 2013 at 4:13 PM, Van Tassell, Kristian < kristian.vantass...@siemens.com> wrote: > Does anyone have any ideas? I don't understand how the query can match, as > I am querying against the same field, and yet get

Re: Bulk word document indexing

2013-03-05 Thread Dmitry Kan
Hello, Look towards Tika. It can handle these MS Word file formats: http://tika.apache.org/1.3/formats.html#Microsoft_Office_document_formats Solr Wiki: http://wiki.apache.org/solr/ExtractingRequestHandler I don't have a link for a tutorial with example schemas. Dmitry On Tue, Mar 5, 2013 at

Re: Bulk word document indexing

2013-03-05 Thread Dmitry Kan
Probably, the bulk indexing feature is not implemented for tika processing, but you can easily compile a script yourself: Extract in a loop over the word files in a directory: curl "http://localhost:8983/solr/update/extract?literal.id=doc5&defaultField=text" --data-binary @tutorial.html -H 'Co
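A rough sketch of such a loop, written in Python rather than as a shell script around curl. The endpoint and the `literal.id` and `defaultField` parameters are taken from the curl call in the message above; using the file name (without extension) as the document id, the `*.doc` pattern, and the `application/octet-stream` content type are assumptions made for illustration:

```python
from pathlib import Path
from urllib.parse import urlencode
from urllib.request import Request, urlopen

EXTRACT_URL = "http://localhost:8983/solr/update/extract"  # endpoint from the message above


def extract_url(doc_id: str) -> str:
    """Build the extract URL for one document id."""
    return EXTRACT_URL + "?" + urlencode({"literal.id": doc_id, "defaultField": "text"})


def index_directory(directory: str, pattern: str = "*.doc") -> None:
    """POST every matching file in the directory to the extract handler."""
    for path in sorted(Path(directory).glob(pattern)):
        request = Request(
            extract_url(path.stem),  # assumption: file name without extension as the id
            data=path.read_bytes(),
            headers={"Content-Type": "application/octet-stream"},  # assumption
        )
        with urlopen(request) as response:
            print(path.name, response.status)
```

Nothing here issues a commit, so the posted documents would become searchable only after a commit is sent separately or an autocommit fires.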

Re: access matched token ids in the FacetComponent?

2013-03-05 Thread Dmitry Kan
12:47 PM, Dmitry Kan wrote: > Thanks Alexandre for correcting the link and Mikhail for sharing the ideas! > > Mihkail, > > I will need to look closer at your customization of SpansFacetComponent on > the blogpost. > Is it so, that in this component, you are accessing and co

SOLR-2703: spanNOT supported?

2013-03-06 Thread Dmitry Kan
Hello, is spanNOT operator supported in this patch? If not, is there a need for this feature for anyone? Regards, Dmitry
