DocValued SortableText Field is slower than Non DocValued String Field for Facet

2021-01-28 Thread Jae Joo
I am wondering why faceting on a docValues SortableText field is slower
than on a non-docValues string field.

Does anyone know why?


Thanks,

Jae


Replication SolrCloud

2021-01-15 Thread Jae Joo
Is non-CDCR replication in SolrCloud still supported in Solr 9.0?

Jae


HugePage Solr

2020-12-16 Thread Jae Joo
Does anyone have experience using HugePages (UseLargePages)?  How much
performance benefit can we get from utilizing it?
The disk is NOT SSD and the node has 256 GB of RAM.  The heap is 31.99 GB.

Thanks,

Jae
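For context, large pages are enabled on the JVM side with flags like the following (a sketch only; exact flag support varies by JVM version, and the OS must also have huge pages reserved, e.g. via vm.nr_hugepages on Linux):

```
-Xms31g -Xmx31g
-XX:+UseLargePages
```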


Re: Function Query Optimization

2020-12-14 Thread Jae Joo
Would a subquery be faster than a function query?

On Sat, Dec 12, 2020 at 10:24 AM Vincenzo D'Amore 
wrote:

> Hi, looking at this sample it seems you have just one document for '12345',
> one for '23456', and so on. If this is true, why not just try
> with a subquery
>
> https://lucene.apache.org/solr/guide/6_6/transforming-result-documents.html#TransformingResultDocuments-_subquery_
>
> On Fri, Dec 11, 2020 at 3:31 PM Jae Joo  wrote:
>
> > I have a requirement to return a field, xyz, based on the matched
> > result.
> > Here is the code:
> >
> > XYZ:concat(
> >
> > if(exists(query({!v='field1:12345'})), '12345', ''),
> >
> > if(exists(query({!v='field1:23456'})), '23456', ''),
> >
> > if(exists(query({!v='field1:34567'})), '34567', ''),
> >
> > if(exists(query({!v='field:45678'})), '45678','')
> > ),
> >
> > This feels very complex, so I am looking for smarter and faster
> > approaches.
> >
> > Thanks,
> >
> > Jae
> >
>
>
> --
> Vincenzo D'Amore
>


Function Query Optimization

2020-12-11 Thread Jae Joo
I have a requirement to return a field, xyz, based on the matched
result.
Here is the code:

XYZ:concat(

if(exists(query({!v='field1:12345'})), '12345', ''),

if(exists(query({!v='field1:23456'})), '23456', ''),

if(exists(query({!v='field1:34567'})), '34567', ''),

if(exists(query({!v='field:45678'})), '45678','')
),

This feels very complex, so I am looking for smarter and faster
approaches.

Thanks,

Jae
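If the list of codes grows, the repeated if(exists(query(...))) clauses can be generated rather than written by hand. A minimal sketch (the helper name is made up):

```python
def build_concat_fl(field, codes):
    """Generate the concat(if(exists(query({!v='field:code'})), 'code', ''), ...)
    function query from a list of codes."""
    clauses = [
        "if(exists(query({{!v='{f}:{c}'}})), '{c}', '')".format(f=field, c=code)
        for code in codes
    ]
    return "concat({})".format(",".join(clauses))

# fl parameter built programmatically instead of by hand
fl = "XYZ:" + build_concat_fl("field1", ["12345", "23456", "34567", "45678"])
print(fl)
```

This only removes the repetition; the query itself still runs one exists(query(...)) per code, so a subquery (as suggested in the reply above) may still be the faster option.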


Re: facet.method=smart

2020-12-04 Thread Jae Joo
Thanks!

Jae

On Fri, Dec 4, 2020 at 1:38 AM Radu Gheorghe 
wrote:

> Hi Jae,
>
> No, it’s not smarter than explicitly defining, for example enum for a
> low-cardinality field.
>
> Think of “smart” as a default path, and explicit definitions as some
> “hints”. You can see that default path in this function:
> https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/search/facet/FacetField.java#L74
>
> Note that I’ve added a PR with a bit more explanation of the “hints”
> here: https://github.com/apache/lucene-solr/pull/2057 But if you’re
> missing some info, please feel free to comment (here or there), I could add
> some more info.
>
> Best regards,
> Radu
> --
> Sematext Cloud - Full Stack Observability - https://sematext.com
> Solr and Elasticsearch Consulting, Training and Production Support
>
> > On 30 Nov 2020, at 22:46, Jae Joo  wrote:
> >
> > Is "smart" really smarter than one explicitly defined?
> >
> > For "enum" type, would it be faster to define facet.method=enum than
> smart?
> >
> > Jae
>
>


Facet to part of search results

2020-12-03 Thread Jae Joo
Is there any way to apply faceting to only part of the search result?
For example, "dog" returns 10M documents and we would like to facet only the first 10K.
Possible?

Jae


facet.method=smart

2020-11-30 Thread Jae Joo
Is "smart" really smarter than one explicitly defined?

For "enum" type, would it be faster to define facet.method=enum than smart?

Jae
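As a reference for experimenting with this, facet.method can also be overridden per field via f.&lt;field&gt;.facet.method. A small sketch (field names are illustrative) of building the request parameters:

```python
def facet_params(field, method=None):
    """Build Solr facet request parameters.  method may be "enum", "fc",
    "fcs", or None to let Solr pick on its own (the "smart" default path)."""
    params = {"q": "*:*", "facet": "true", "facet.field": field}
    if method is not None:
        # per-field override: f.<field>.facet.method
        params["f.{}.facet.method".format(field)] = method
    return params

print(facet_params("category", "enum"))
```

Comparing QTime for the same facet field with method="enum" versus method=None is a quick way to answer the question above empirically for a given cardinality.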


PositionGap

2020-10-09 Thread Jae Joo
Does increasing the position gap make search slower?

Jae


Multivalued field for Analysis on Admin page.

2020-10-09 Thread Jae Joo
I forgot how to enter multivalued field input on the Analysis page in the Admin UI.
Can anyone help?

Jae


TimeAllowed and Partial Results

2020-09-22 Thread Jae Joo
I have timeAllowed=2000 (2 sec) and am mostly getting 0 hits back. Shouldn't
I be getting more than 0 results?

Jae
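When timeAllowed expires, Solr returns whatever was collected so far and sets partialResults=true in the response header, so 0 hits can simply mean the timeout fired before anything was collected. A sketch (assuming the standard JSON response shape) of telling the two cases apart:

```python
def interpret_hits(response):
    """Distinguish a genuine zero-hit result from a timeAllowed timeout
    using the partialResults response-header flag Solr sets on timeout."""
    header = response.get("responseHeader", {})
    num_found = response.get("response", {}).get("numFound", 0)
    if header.get("partialResults") and num_found == 0:
        return "timed out before collecting hits; raise timeAllowed"
    return "{} hits".format(num_found)
```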


Stop a submitted async job?

2020-09-18 Thread Jae Joo
Hi,

Is there any way to stop the job running in Async mode?

Thanks,


Re: Multiple Collections in an Alias.

2020-08-12 Thread Jae Joo
I found the root cause. I have 3 collections assigned to an alias, and one
of them is NOT synched with the others.

Collection 1

Collection 2

Collection 3

On Wed, Aug 12, 2020 at 7:29 PM Jae Joo  wrote:

> Good question. How can I validate if the replicas are all synched?
>
>
> On Wed, Aug 12, 2020 at 7:28 PM Jae Joo  wrote:
>
>> numFound is the same but the score is different.
>>
>> On Wed, Aug 12, 2020 at 6:01 PM Aroop Ganguly
>>  wrote:
>>
>>> Try a simple test of querying each collection 5 times in a row; if the
>>> numFound are different for a single collection within those 5 calls then you
>>> have it.
>>> Please try it; what you may think is sync’d may actually not be. How do
>>> you validate correct sync?
>>>
>>> > On Aug 12, 2020, at 10:55 AM, Jae Joo  wrote:
>>> >
>>> > The replications are all synched and there are no updates while I was
>>> > testing.
>>> >
>>> >
>>> > On Wed, Aug 12, 2020 at 1:49 PM Aroop Ganguly
>>> >  wrote:
>>> >
>>> >> Most likely you have 1 or more collections behind the alias that have
>>> >> replicas out of sync :)
>>> >>
>>> >> Try querying each collection to find the one out of sync.
>>> >>
>>> >>> On Aug 12, 2020, at 10:47 AM, Jae Joo  wrote:
>>> >>>
>>> >>> I have 10 collections in single alias and having different result
>>> sets
>>> >> for
>>> >>> every time with the same query.
>>> >>>
>>> >>> Is it as designed or do I miss something?
>>> >>>
>>> >>> The configuration and schema for all 10 collections are identical.
>>> >>> Thanks,
>>> >>>
>>> >>> Jae
>>> >>
>>> >>
>>>
>>>


Re: Multiple Collections in an Alias.

2020-08-12 Thread Jae Joo
Good question. How can I validate if the replicas are all synched?


On Wed, Aug 12, 2020 at 7:28 PM Jae Joo  wrote:

> numFound is the same but the score is different.
>
> On Wed, Aug 12, 2020 at 6:01 PM Aroop Ganguly
>  wrote:
>
>> Try a simple test of querying each collection 5 times in a row; if the
>> numFound are different for a single collection within those 5 calls then you
>> have it.
>> Please try it; what you may think is sync’d may actually not be. How do
>> you validate correct sync?
>>
>> > On Aug 12, 2020, at 10:55 AM, Jae Joo  wrote:
>> >
>> > The replications are all synched and there are no updates while I was
>> > testing.
>> >
>> >
>> > On Wed, Aug 12, 2020 at 1:49 PM Aroop Ganguly
>> >  wrote:
>> >
>> >> Most likely you have 1 or more collections behind the alias that have
>> >> replicas out of sync :)
>> >>
>> >> Try querying each collection to find the one out of sync.
>> >>
>> >>> On Aug 12, 2020, at 10:47 AM, Jae Joo  wrote:
>> >>>
>> >>> I have 10 collections in single alias and having different result sets
>> >> for
>> >>> every time with the same query.
>> >>>
>> >>> Is it as designed or do I miss something?
>> >>>
>> >>> The configuration and schema for all 10 collections are identical.
>> >>> Thanks,
>> >>>
>> >>> Jae
>> >>
>> >>
>>
>>


Re: Multiple Collections in an Alias.

2020-08-12 Thread Jae Joo
numFound is the same but the score is different.

On Wed, Aug 12, 2020 at 6:01 PM Aroop Ganguly
 wrote:

> Try a simple test of querying each collection 5 times in a row; if the
> numFound are different for a single collection within those 5 calls then you
> have it.
> Please try it; what you may think is sync’d may actually not be. How do
> you validate correct sync?
>
> > On Aug 12, 2020, at 10:55 AM, Jae Joo  wrote:
> >
> > The replications are all synched and there are no updates while I was
> > testing.
> >
> >
> > On Wed, Aug 12, 2020 at 1:49 PM Aroop Ganguly
> >  wrote:
> >
> >> Most likely you have 1 or more collections behind the alias that have
> >> replicas out of sync :)
> >>
> >> Try querying each collection to find the one out of sync.
> >>
> >>> On Aug 12, 2020, at 10:47 AM, Jae Joo  wrote:
> >>>
> >>> I have 10 collections in single alias and having different result sets
> >> for
> >>> every time with the same query.
> >>>
> >>> Is it as designed or do I miss something?
> >>>
> >>> The configuration and schema for all 10 collections are identical.
> >>> Thanks,
> >>>
> >>> Jae
> >>
> >>
>
>


Re: Multiple Collections in an Alias.

2020-08-12 Thread Jae Joo
The replicas are all synched, and there were no updates while I was
testing.


On Wed, Aug 12, 2020 at 1:49 PM Aroop Ganguly
 wrote:

> Most likely you have 1 or more collections behind the alias that have
> replicas out of sync :)
>
> Try querying each collection to find the one out of sync.
>
> > On Aug 12, 2020, at 10:47 AM, Jae Joo  wrote:
> >
> > I have 10 collections in single alias and having different result sets
> for
> > every time with the same query.
> >
> > Is it as designed or do I miss something?
> >
> > The configuration and schema for all 10 collections are identical.
> > Thanks,
> >
> > Jae
>
>


Multiple Collections in an Alias.

2020-08-12 Thread Jae Joo
I have 10 collections in a single alias and am getting a different result set
every time with the same query.

Is it as designed or do I miss something?

The configuration and schema for all 10 collections are identical.
Thanks,

Jae
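The replies in this thread suggest querying each collection repeatedly and comparing numFound. A small sketch of that check, assuming the per-collection counts have already been collected (for example with repeated identical queries, or with distrib=false against each replica):

```python
def find_unsynced(counts_by_collection):
    """Given {collection: [numFound from repeated identical queries]},
    return the collections whose counts disagree across calls --
    a sign that the replicas behind them are out of sync."""
    return sorted(
        name
        for name, counts in counts_by_collection.items()
        if len(set(counts)) > 1
    )

print(find_unsynced({"col1": [100, 100, 100], "col2": [100, 98, 100]}))
```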


Distribution of Lead Replicas

2020-08-03 Thread Jae Joo
I have a cluster of 8 nodes hosting 24 shards with replicationFactor=3, and
only 4 of the nodes hold leader replicas.
Is there any way to redistribute the leaders across all 8 nodes?

Thanks,
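One option here, assuming a recent enough Solr (these Collections API actions were added in the 5.x line; the collection name below is illustrative): spread the preferredLeader replica property evenly across nodes, then ask Solr to shift leadership to the preferred replicas:

```
/admin/collections?action=BALANCESHARDUNIQUE&collection=mycollection&property=preferredLeader
/admin/collections?action=REBALANCELEADERS&collection=mycollection
```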


Re: Null pointer exception in QueryComponent.mergeIds method

2020-07-07 Thread Jae Joo
Yes, we have timeAllowed=2 sec.


On Tue, Jul 7, 2020 at 2:20 PM Mikhail Khludnev  wrote:

> Still not clear regarding the fl param. Does the request enable the timeAllowed param?
> Anyway debugQuery true should give a clue why  "sort_values"  are absent in
> shard response, note they should be supplied at
> QueryComponent.doFieldSortValues(ResponseBuilder, SolrIndexSearcher).
>
> On Tue, Jul 7, 2020 at 4:19 PM Jae Joo  wrote:
>
> > 8.3.1
> >
> >   > required="true" multiValued="false" docValues="true"/>
> >   > required="true" multiValued="false"/>
> >
> > the field "id" is for nested document.
> >
> >
> >
> >
> > On Mon, Jul 6, 2020 at 4:17 PM Mikhail Khludnev  wrote:
> >
> > > Hi,
> > > What's the version? What's uniqueKey? is it stored? what's fl param?
> > >
> > > On Mon, Jul 6, 2020 at 5:12 PM Jae Joo  wrote:
> > >
> > > > I am seeing the nullPointerException in the list below and I am
> > > > looking for how to fix the exception.
> > > >
> > > > Thanks,
> > > >
> > > >
> > > > NamedList sortFieldValues =
> > > > (NamedList)(srsp.getSolrResponse().getResponse().get("sort_values"));
> > > > if (sortFieldValues.size()==0 && // we bypass merging this response
> > > > only if it's partial itself
> > > > thisResponseIsPartial) { // but not the previous
> > > one!!
> > > >   continue; //fsv timeout yields empty sort_vlaues
> > > > }
> > > >
> > > >
> > > >
> > > > 2020-07-06 12:45:47.001 ERROR (qtp745962066-636182) [c:]]
> > > > o.a.s.h.RequestHandlerBase java.lang.NullPointerException
> > > > at
> > > >
> > > >
> > >
> >
> org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:914)
> > > > at
> > > >
> > > >
> > >
> >
> org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:613)
> > > > at
> > > >
> > > >
> > >
> >
> org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:592)
> > > > at
> > > >
> > > >
> > >
> >
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:431)
> > > > at
> > > >
> > > >
> > >
> >
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:198)
> > > > at org.apache.solr.core.SolrCore.execute(SolrCore.java:2576)
> > > > at
> > > > org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:799)
> > > > at
> > > org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:578)
> > > > at
> > > >
> > > >
> > >
> >
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:419)
> > > > at
> > > >
> > > >
> > >
> >
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:351)
> > > > at
> > > >
> > > >
> > >
> >
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1602)
> > > > at
> > > >
> > >
> >
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)
> > > > at
> > > >
> > > >
> > >
> >
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)
> > > > at
> > > >
> > >
> >
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> > > > at
> > > >
> > > >
> > >
> >
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
> > > > at
> > > >
> > > >
> > >
> >
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
> > > > at
> > > >
> > > >
> > >
> >
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1711)
> > > > at
> > > >
> > > >
> > >
> >
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
> > > >
> > >
> > >
> > > --
> > > Sincerely yours
> > > Mikhail Khludnev
> > >
> >
>
>
> --
> Sincerely yours
> Mikhail Khludnev
>


Re: Null pointer exception in QueryComponent.mergeIds method

2020-07-07 Thread Jae Joo
8.3.1

 
 

the field "id" is for nested document.




On Mon, Jul 6, 2020 at 4:17 PM Mikhail Khludnev  wrote:

> Hi,
> What's the version? What's uniqueKey? is it stored? what's fl param?
>
> On Mon, Jul 6, 2020 at 5:12 PM Jae Joo  wrote:
>
> > I am seeing the nullPointerException in the list below and I am
> > looking for how to fix the exception.
> >
> > Thanks,
> >
> >
> > NamedList sortFieldValues =
> > (NamedList)(srsp.getSolrResponse().getResponse().get("sort_values"));
> > if (sortFieldValues.size()==0 && // we bypass merging this response
> > only if it's partial itself
> > thisResponseIsPartial) { // but not the previous
> one!!
> >   continue; //fsv timeout yields empty sort_vlaues
> > }
> >
> >
> >
> > 2020-07-06 12:45:47.001 ERROR (qtp745962066-636182) [c:]]
> > o.a.s.h.RequestHandlerBase java.lang.NullPointerException
> > at
> >
> >
> org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:914)
> > at
> >
> >
> org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:613)
> > at
> >
> >
> org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:592)
> > at
> >
> >
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:431)
> > at
> >
> >
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:198)
> > at org.apache.solr.core.SolrCore.execute(SolrCore.java:2576)
> > at
> > org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:799)
> > at
> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:578)
> > at
> >
> >
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:419)
> > at
> >
> >
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:351)
> > at
> >
> >
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1602)
> > at
> >
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)
> > at
> >
> >
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)
> > at
> >
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> > at
> >
> >
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
> > at
> >
> >
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
> > at
> >
> >
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1711)
> > at
> >
> >
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
> >
>
>
> --
> Sincerely yours
> Mikhail Khludnev
>


Null pointer exception in QueryComponent.mergeIds method

2020-07-06 Thread Jae Joo
I am seeing the NullPointerException shown below and am looking for a way
to fix it.

Thanks,


NamedList sortFieldValues =
(NamedList)(srsp.getSolrResponse().getResponse().get("sort_values"));
if (sortFieldValues.size()==0 && // we bypass merging this response
only if it's partial itself
thisResponseIsPartial) { // but not the previous one!!
  continue; //fsv timeout yields empty sort_vlaues
}



2020-07-06 12:45:47.001 ERROR (qtp745962066-636182) [c:]]
o.a.s.h.RequestHandlerBase java.lang.NullPointerException
at
org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:914)
at
org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:613)
at
org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:592)
at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:431)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:198)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2576)
at
org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:799)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:578)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:419)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:351)
at
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1602)
at
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)
at
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at
org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
at
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1711)
at
org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)


Re: Synonym Graph Filter - how to preserve original

2020-05-08 Thread Jae Joo
Putting the original term in the synonym list works.
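For reference, "putting the original term in the list" looks like this in synonyms.txt (terms borrowed from another thread here): an equivalence line keeps every listed term, while an explicit => mapping keeps the original only if it is repeated on the right-hand side:

```
# equivalence form: all listed terms are kept, including the original
antigravity, anti gravity
# explicit mapping form: repeat the original on the right to preserve it
antigravity => antigravity, anti gravity
```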


On Fri, May 8, 2020 at 1:05 PM atin janki  wrote:

> Hi Jae,
>
> Do try to explain your problem with an example. Also share how you are
> writing the synonyms file.
> Best Regards,
> Atin Janki
>
>
> On Fri, May 8, 2020 at 6:14 PM Jae Joo  wrote:
>
> > In 8.3, There should be the way to preserve the original terms, but could
> > not find it.
> >
> > Does anyone know?
> >
> > Thanks,
> >
> > Jae
> >
>


Synonym Graph Filter - how to preserve original

2020-05-08 Thread Jae Joo
In 8.3, there should be a way to preserve the original terms, but I could
not find it.

Does anyone know?

Thanks,

Jae


Nested Document with replicas slow

2020-04-13 Thread Jae Joo
I have several hundred million documents using nested documents for joining. It
is the fastest way to join with a single replica. After adding more replicas (2
or 3), performance slows down significantly (about 100x).
Does anyone have the same experience?

Jae


Re: SolrCloud - Replica is shown as "Recovery-Failed"

2015-10-19 Thread Jae Joo
Found the root cause. I disabled the transaction log.

Thanks,

On Mon, Oct 19, 2015 at 1:07 PM, Jae Joo <jaejo...@gmail.com> wrote:

> Solr Version: 5.3
>
> I just built the SolrCloud with 5 shards and replicationFactor=3 on 15
> nodes, so each shard and replica runs on its own server.
>
> When I see the Cloud page, I see that the status of replica is
> "recovery-failed".
> For testing, I took down the leader, but a replica couldn't become the leader
> because its status was not active.
>
> INFO  - 2015-10-19 16:46:16.297;
> org.apache.solr.cloud.ShardLeaderElectionContext; My last published State
> was recovery_failed, I won't be the leader.
>
> There are no documents indexed.
>
> Any help?
>
> Jae
>


SolrCloud - Replica is shown as "Recovery-Failed"

2015-10-19 Thread Jae Joo
Solr Version: 5.3

I just built the SolrCloud with 5 shards and replicationFactor=3 on 15
nodes, so each shard and replica runs on its own server.

When I look at the Cloud page, I see that the status of a replica is
"recovery-failed".
For testing, I took down the leader, but a replica couldn't become the leader
because its status was not active.

INFO  - 2015-10-19 16:46:16.297;
org.apache.solr.cloud.ShardLeaderElectionContext; My last published State
was recovery_failed, I won't be the leader.

There are no documents indexed.

Any help?

Jae


Re: statsCache issue

2015-09-09 Thread Jae Joo
Thanks for your tip. Let me test in 5.3.



On Wed, Sep 9, 2015 at 4:23 PM, Markus Jelsma 
wrote:

> Hello - there are several issues with StatsCache < 5.3. If it  is loaded,
> it won't work reliably. We are using it properly on 5.3. Statistics may be
> a bit off if you are using BM25 though. You should upgrade to 5.3.
>
> Markus
>
> -Original message-
> > From:Jae Joo 
> > Sent: Wednesday 9th September 2015 21:23
> > To: solr-user@lucene.apache.org
> > Subject: statsCache issue
> >
> > Solr Version: 5.2.1
> >
> > Container: Tomcat (still).
> >
> > in SolrConfig.xml:
> >
> > 
> >
> >
> > However, I see the class is not plugged in.
> >
> > in log file:
> >
> > org.apache.solr.core.SolrCore; Using default statsCache cache:
> > org.apache.solr.search.stats.LocalStatsCache
> >
> >
> > Any reason why?
> >
> >
> > Thanks,
> >
> >
> > Jae
> >
>


statsCache issue

2015-09-09 Thread Jae Joo
Solr Version: 5.2.1

Container: Tomcat (still).

in SolrConfig.xml:




However, I see the class is not plugged in.

in log file:

org.apache.solr.core.SolrCore; Using default statsCache cache:
org.apache.solr.search.stats.LocalStatsCache


Any reason why?


Thanks,


Jae


PatternReplaceCharFilterFactory and Position

2015-07-14 Thread Jae Joo
I am having an issue with the start and end positions of a token.
Here is the CharFilterFactory:

<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="&lt;/?
 *ce(bold|sup|inf|hsp|vsp|italic)[^&gt;]*&gt;" replacement="X"/>


Then the input data is:

<ce:sup loc="post">1</ce:sup>

In the Analysis page:

text | raw_bytes | start | end | positionLength | type | position
1    | [31]      | 21    | 31  | 1              | word | 1

Shouldn't the end position be 22? It breaks the highlighting...
HTMLStripCharFilterFactory is working properly

Any help?


Jae


WordDelimiterFilterFactory and PatternReplaceCharFilterFactory

2014-11-05 Thread Jae Joo
Hi,

Once I apply PatternReplaceCharFilterFactory to the input string, the
positions of the tokens change.
Here is an example:

<charFilter class="solr.PatternReplaceCharFilterFactory"
pattern="(&lt;/?ce:italic[^&gt;]*&gt;)" replacement=""/>
<filter class="solr.WordDelimiterFilterFactory"
generateWordParts="1"
generateNumberParts="1"
splitOnCaseChange="0"
splitOnNumerics="0"
catenateWords="1"
catenateNumbers="0"
catenateAll="0"
preserveOriginal="1"
/>

In the analysis page,
<ce:italic>p</ce:italic>-xylene and p-xylene (without XML tags) have
different positions.

For <ce:italic>p</ce:italic>-xylene:
p-xylene -- 1
xylene -- 2
p -- 2
pxylene --

However, for the term (without tags) p-xylene,
p-xylene -- 1
p -- 1
xylene -- 2
pxylene -- 3

The only difference I can see is the start and end positions, because of the XML tags.

Does anyone know why?

Thanks,

Jae Joo


field specified edismax

2014-09-09 Thread Jae Joo
Is there any way to apply different edismax parameters field by field?
For ex.
q=keywords:(lung cancer) AND title:chemotherapy

I would like to apply different qf for fields, keywords and title.
f.keywords.qf=keywords^40 subkeywords^20
f.title.qf=title^80 subtitle^20

I know it can be done with field aliasing, but I'd rather not use field
aliasing.

Thanks,

Jae


Synonym - multiple words and position

2014-08-27 Thread Jae Joo
In the synonym file,
antigravity, anti gravity

In the Analysis page, I see that the position of "anti" is 1 and "gravity" is 2.
Is there any way to keep the positions of both "anti" and "gravity" at 1?
And is there a way to configure the synonym as the phrase "anti gravity"
rather than the separate tokens "anti" and "gravity" for "antigravity"?

Thanks,

Jae
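A toy model (plain Python, not Lucene code) of the behavior asked about above: with a flat, non-graph synonym expansion, each word of a multi-word synonym is emitted at its own consecutive position, which is why "anti" lands at position 1 and "gravity" at position 2:

```python
def flat_expand(tokens, synonyms):
    """Toy flat (non-graph) synonym expansion: every word of a
    multi-word synonym occupies its own position, so 'anti gravity'
    spans positions 1 and 2 alongside 'antigravity' at position 1."""
    out = []
    pos = 0
    for tok in tokens:
        pos += 1
        out.append((tok, pos))
        for syn in synonyms.get(tok, []):
            for offset, word in enumerate(syn.split()):
                out.append((word, pos + offset))
    return out

print(flat_expand(["antigravity"], {"antigravity": ["anti gravity"]}))
```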


Range query and Highlighting

2014-07-18 Thread Jae Joo
If I use a combined query - a range query plus others (term queries) - all
matched terms in the field are highlighted. Is there any way to highlight
only the term(s) from the term query?
Here is an example:

+date:{20031231 TO *] +(title:red)

It highlights all terms except stopwords.


Using fq would not be an option because there may be multiple term queries
and boolean queries combined.


Any idea?


Jae


Synonyms - 20th and 20

2014-06-18 Thread Jae Joo
I have a synonyms.txt file which has
20th,twentieth

Once I apply the synonym, I see 20th, twentieth, and 20 for "20th".
Does anyone know where the 20 comes from? How can I get only 20th and
twentieth?

Thanks,

Jae


Retrieving Ranking (Position)

2011-03-17 Thread Jae Joo
Hi,

I am looking for a way to retrieve the ranking (or position) of a matched
document in the result set.

I can fetch the results and parse them to find the position of the matched
document, but I would like to know if there is a built-in feature for this.

Thanks,

Jae


NRT in Solr

2011-03-08 Thread Jae Joo
Hi,
Is NRT in Solr 4.0 from trunk? I have checked out trunk, but could not
find the configuration for NRT.

Regards

Jae


Solr Sharding and idf

2011-03-02 Thread Jae Joo
Is there still an issue with distributed IDF in a sharded environment in
Solr 1.4 or 4.0?
If yes, any suggestions to resolve it?

Thanks,

Jae


Re: Solr Sharding and idf

2011-03-02 Thread Jae Joo
Yes, I know that the ticket is still open. That is why I am looking for
solutions now.

2011/3/2 Tomás Fernández Löbbe tomasflo...@gmail.com

 Hi Jae, this is the Jira created for the problem of IDF on distributed
 search:

 https://issues.apache.org/jira/browse/SOLR-1632

 It's still open

 On Wed, Mar 2, 2011 at 1:48 PM, Upayavira u...@odoko.co.uk wrote:

  As I understand it there is, and the best you can do is keep the same
  number of docs per shard, and keep your documents randomised across
  shards. That way you'll minimise the chances of suffering from
  distributed IDF issues.
 
  Upayavira
 
  On Wed, 02 Mar 2011 10:10 -0500, Jae Joo jaejo...@gmail.com wrote:
   Is there still issue regarding distributed idf in sharding environment
 in
   Solr 1.4 or 4.0?
   If yes, any suggestions to resolve it?
  
   Thanks,
  
   Jae
  
  ---
  Enterprise Search Consultant at Sourcesense UK,
  Making Sense of Open Source
 
 



Spatial search - Solr 4.0

2010-12-07 Thread Jae Joo
Hi,

I am implementing spatial search and found something odd. As far as I know,
returning the distance is still being implemented, so I implemented an
algorithm to calculate the actual distance from the lat and long returned.
When I do this, I find that the sort is not working properly. Is there
anything I missed?

Jae


Sharding and Index Update

2010-01-07 Thread Jae Joo
All,

I have two indices - one has 23M documents and the other has fewer than 1000.
The small index is for real-time updates.

Does updating the small index (with commits) hurt overall performance?
(We cannot update the big 23M index in real time because of heavy traffic and
its size.)

Thanks,

Jae Joo


solr.RemoveDuplicatesTokenFilterFactory

2009-12-22 Thread Jae Joo
Hi,

Here is a string to be indexed without duplicates:

Kitchen Cabinet Utah Kitchen Remodeling Utah

Is RemoveDuplicatesTokenFilterFactory the right solution for this, or is it
for something else?

Jae


multi words synonyms

2009-08-19 Thread Jae Joo
Hi,

I would like to map the synonym "internal medicine" to "physician" or
"doctor", but it is not working properly. Can anyone help me?

synonym.index.txt
internal medicine => physician

synonyms.query.txt
physician, internal medicine => physician, doctor

In the Analysis tool, I can clearly see that "internal medicine" is converted
to "physician" and "doctor" at both index and query time, but in an actual
query it is not converted (checked with the debugQuery=true parameter).

<lst name="debug">
<str name="rawquerystring">internal medicine</str>
<str name="querystring">internal medicine</str>
<str name="parsedquery">job:intern job:medicin</str>
<str name="parsedquery_toString">job:intern job:medicin</str>

It returns:
<doc>
<float name="score">1.3963256</float>
<str name="job">874878_INTERNATIONAL CONSULTANTS</str>
</doc>

Here is what I have in schema.xml:
   <analyzer type="index">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.SynonymFilterFactory" synonyms="synonyms.index.txt"
ignoreCase="true" expand="false"/>

   <analyzer type="query">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.SynonymFilterFactory" synonyms="synonyms.index.txt"
ignoreCase="true" expand="false"/>


Solr 1.2 and 1.3 - different Stemming

2009-07-10 Thread Jae Joo
I have found that the stemming in Solr 1.2 and 1.3 is different for
"communication". We have an index built with Solr 1.2, and that index is
being queried by 1.3. Is there any way to adjust this?

Jae joo


Joining Solr Indexes

2009-01-28 Thread Jae Joo
Hi,

Is there any way to join multiple indexes in Solr?

Thanks,

Jae


multiple indexes

2009-01-27 Thread Jae Joo
Hi,

I would like to know how it can be implemented.

Index1 has fields id,1,2,3 and index2 has fields id,5,6,7.
The ID in both indexes is a unique id.

Can I use some kind of distributed search and/or multicore setup to search,
sort, and facet across the 2 indexes (index1 and index2)?

Thanks,

Jae joo


prefetching question

2009-01-13 Thread Jae Joo
Hi,

We have 16 million company names and would like to find a way to do
prefetching using Solr.

Does anyone have experience and/or suggestions?

Thanks,

Jae Joo


spellCheckComponent and dismax query type

2008-12-23 Thread Jae Joo
I would like to use spell check with dismax, but it is not working. This
query searches only the default search field defined in schema.xml:

http://localhost:8080/ibegin_mb3/spellCheckCompRH?q=pluming%20heaing&qt=dismax&spellcheck.q=pluming%20heaing&spellcheck.count=10&spellcheck=true&spellcheck.collate=true

Can any one help me?

Thanks,

Jae Joo


DataImportHandler - time stamp format in

2008-12-05 Thread Jae Joo
In the dataimport.properties file, there is a timestamp:

#Thu Dec 04 15:36:22 EST 2008
last_index_time=2008-12-04 15\:36\:20

I am using Oracle (10g) and would like to know which timestamp format
I have to use in Oracle.

Thanks,

Jae


Re: Solr on Solaris

2008-12-05 Thread Jae Joo
I have the same experience.
What is the CPU in the Solaris box? It does not depend on the operating
system (Linux or Solaris); it depends on the CPU (Intel or SPARC).
I don't know why, but based on my performance tests, a SPARC machine requires
MORE memory for a Java application.

Jae

On Thu, Dec 4, 2008 at 10:40 PM, Kashyap, Raghu [EMAIL PROTECTED]wrote:

 We are running solr on a solaris box with 4 CPU's(8 cores) and  3GB Ram.
 When we try to index sometimes the HTTP Connection just hangs and the
 client which is posting documents to solr doesn't get any response back.
 We since then have added timeouts to our http requests from the clients.



 I then get this error.



 java.lang.OutOfMemoryError: requested 239848 bytes for Chunk::new. Out
 of swap space?

 java.lang.OutOfMemoryError: unable to create new native thread

 Exception in thread JmxRmiRegistryConnectionPoller
 java.lang.OutOfMemoryError: unable to create new native thread



 We are running JDK 1.6_10 on the Solaris box. The weird thing is we
 are running the same application on linux box with JDK 1.6 and we
 haven't seen any problem like this.



 Any suggestions?



 -Raghu




DataImport Handler - newbie question

2008-12-02 Thread Jae Joo
Hey,

I am trying to connect to an Oracle database and index values into Solr,
but I am getting:
Document [null] missing required field: id

Here is the debug output:
<str name="Total Requests made to DataSource">1</str>
<str name="Total Rows Fetched">2</str>
<str name="Total Documents Skipped">0</str>
<str name="Full Dump Started">2008-12-02 13:49:35</str>
<str name="">
Indexing completed. Added/Updated: 0 documents. Deleted 0 documents.
</str>

schema.xml
<field name="id" type="string" indexed="true" stored="true" required="true" />
<field name="subject" type="text" indexed="true" stored="true" omitNorms="true"/>

</fields>
<uniqueKey>id</uniqueKey>


data-config.xml

<dataConfig>
  <dataSource driver="oracle.jdbc.driver.OracleDriver"
      url="jdbc:oracle:thin:@x.x.x.x:" user="..." password="..."/>
  <document name="companyQAIndex">
    <entity name="companyqa" pk="id" query="select * from solr_test">
      <field column="id" name="id" />
      <field column="text" name="subject" />
    </entity>
  </document>
</dataConfig>

Database Schema
id  is the pk.
There are only 2 rows in the table solr_test.

Can anyone tell me what I am doing wrong?

Jae


Re: DataImport Handler - newbie question

2008-12-02 Thread Jae Joo
I actually found the problem: Oracle returns the field names in uppercase.

On Tue, Dec 2, 2008 at 1:57 PM, Jae Joo [EMAIL PROTECTED] wrote:

 Hey,

 I am trying to connect to the Oracle database and index the values into
 Solr, but I am getting:
 Document [null] missing required field: id.

 Here is the debug output.
 <str name="Total Requests made to DataSource">1</str>
 <str name="Total Rows Fetched">2</str>
 <str name="Total Documents Skipped">0</str>
 <str name="Full Dump Started">2008-12-02 13:49:35</str>
 <str name="">
 Indexing completed. Added/Updated: 0 documents. Deleted 0 documents.
 </str>

 schema.xml
 <field name="id" type="string" indexed="true" stored="true" required="true" />
 <field name="subject" type="text" indexed="true" stored="true" omitNorms="true"/>

 </fields>
 <uniqueKey>id</uniqueKey>


 data-config.xml

 <dataConfig>
   <dataSource driver="oracle.jdbc.driver.OracleDriver"
       url="jdbc:oracle:thin:@x.x.x.x:" user="..." password="..."/>
   <document name="companyQAIndex">
     <entity name="companyqa" pk="id" query="select * from solr_test">
       <field column="id" name="id" />
       <field column="text" name="subject" />
     </entity>
   </document>
 </dataConfig>

 Database Schema
 id  is the pk.
 There are only 2 rows in the table solr_test.

 Can anyone tell me what I am doing wrong?

 Jae
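
Given the fix found above - Oracle returning column names in uppercase - one
way to make the mapping explicit is to name the uppercase columns directly in
data-config.xml. A sketch, assuming the same solr_test table from this thread:

```xml
<entity name="companyqa" pk="id" query="select * from solr_test">
  <!-- Oracle reports unquoted column names in uppercase -->
  <field column="ID" name="id" />
  <field column="TEXT" name="subject" />
</entity>
```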




Facet Query and Query

2008-11-25 Thread Jae Joo

 I am having some trouble utilizing the filter query (fq). As I understand
 it, the filter query has better performance than a plain query (q).
 Here is the example.


 http://localhost:8080/test_solr/select?q=*:*&facet=true&fq=state:CA&facet.mincount=1&facet.field=city&facet.field=sector&facet.limit=-1&sort=score+desc

 -- facet by sector and city for state of CA.
 Any idea how to optimize this query to avoid q=*:*?

 Thanks,

 Jae





Facet Query (fq) and Query (q)

2008-11-24 Thread Jae Joo
I am having some trouble utilizing the filter query (fq). As I understand it,
the filter query has better performance than a plain query (q).
Here is the example.

http://localhost:8080/test_solr/select?q=*:*&facet=true&fq=state:CA&facet.mincount=1&facet.field=city&facet.field=sector&facet.limit=-1&sort=score+desc

-- facet by sector and city for state of CA.
Any idea how to optimize this query to avoid q=*:*?

Thanks,

Jae


Re: Out of Memory Errors

2008-10-22 Thread Jae Joo
Here is what I am doing to check the memory status:
1. Run the servlet container and the Solr application.
2. At a command prompt, run jstat -gc pid 5s (5s means it samples every 5
seconds).
3. Watch it, or pipe it to a file.
4. Analyze the gathered data.

Jae
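
If the jstat output is piped to a file as described, a small script can turn
each sampled line into named numbers for analysis. A sketch in Python - the
column list matches the JDK 5/6 `jstat -gc` header (it varies by JDK version),
and the sample line is made up for illustration:

```python
# Parse one captured line of `jstat -gc <pid> 5s` output into a dict so the
# GC numbers can be graphed or compared over time.
HEADER = "S0C S1C S0U S1U EC EU OC OU PC PU YGC YGCT FGC FGCT GCT".split()

def parse_gc_line(line):
    # Every jstat -gc column is numeric; map header names to values.
    values = [float(v) for v in line.split()]
    return dict(zip(HEADER, values))

sample = ("1024.0 1024.0 0.0 512.3 8192.0 4096.5 65536.0 30000.1 "
          "12288.0 11000.9 42 0.84 3 1.20 2.04")
stats = parse_gc_line(sample)
print(stats["FGC"])  # number of full GC events so far in the sample line
```

Watching OU (old-generation utilization) and FGC grow across samples is the
quickest way to spot the leak-like behavior described in this thread.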

On Tue, Oct 21, 2008 at 9:48 PM, Willie Wong [EMAIL PROTECTED]wrote:

 Hello,

 I've been having issues with out of memory errors on searches in Solr. I
 was wondering if I'm hitting a limit with solr or if I've configured
 something seriously wrong.

 Solr Setup
 - 3 cores
 - 3163615 documents each
 - 10 GB size
 - approx 10 fields
 - document sizes vary from a few kb to a few MB
 - no faceting is used however the search query can be fairly complex with
 8 or more fields being searched on at once

 Environment:
 - windows 2003
 - 2.8 GHz zeon processor
 - 1.5 GB memory assigned to solr
 - Jetty 6 server

 Once we get to around a few concurrent users, OOMs start occurring and Jetty
 restarts.  We have since added timeouts to our http requests from the clients.
 configuration settings that need to be set?  We're using an out of the box
 Solr 1.3 beta version.

 A few of the things we considered that might help:
 - Removing sorts on the result sets (result sets are approx 40,000 +
 documents)
 - Reducing cache sizes such as the queryResultMaxDocsCached setting,
 document cache, queryResultCache, filterCache, etc

 Am I missing anything else that should be looked at, or is it time to
 simply increase the memory/start looking at distributing the indexes?  Any
 help would be much appreciated.


 Regards,

 WW



Re: Multi core weight

2008-05-16 Thread Jae Joo
Running multiple individual queries is one option, but because of the volume
of documents (14 million) and the traffic (10 requests per second), I am
looking for the optimal way to do that.

Thanks,

Jae

On Thu, May 15, 2008 at 11:57 AM, Otis Gospodnetic 
[EMAIL PROTECTED] wrote:

 Jae,
 It sounds like you are doing a distributed search across your 3 cores on a
 single Solr instance?  Why not do run 3 individual queries (parallel or
 serial, your choice) and pick however many hits you need from each result?

  Otis
 --
 Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


 - Original Message 
  From: Jae Joo [EMAIL PROTECTED]
  To: solr-user@lucene.apache.org
  Sent: Thursday, May 15, 2008 9:07:54 AM
  Subject: Multi core weight
 
  Hi,
 
  I am looking for the best or possible way to add WEIGHT to each core in
  multi core environment.
 
  core 1 has about 10 millions articles from same publisher and core 2 and
 3
  have less than 10k.
  I would like to have BALANCED Query result - ex. 10 from core 1, 10 from
  core 2 and 10 from core 3..
 
  Thanks,
 
  Jae




Multi core weight

2008-05-15 Thread Jae Joo
Hi,

I am looking for the best or possible way to add WEIGHT to each core in
multi core environment.

Core 1 has about 10 million articles from the same publisher, and cores 2 and
3 have less than 10k.
I would like to have BALANCED Query result - ex. 10 from core 1, 10 from
core 2 and 10 from core 3..

Thanks,

Jae
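
One client-side way to get such a balanced result (absent per-core weighting
in Solr itself) is to query each core separately and interleave the hit lists
round-robin, so each core contributes evenly until it runs out. A sketch with
made-up document ids:

```python
# Round-robin merge of per-core result lists: take one hit from each core in
# turn, skipping cores that have been exhausted.
from itertools import chain, zip_longest

def interleave(*result_lists):
    merged = chain.from_iterable(zip_longest(*result_lists))
    return [doc for doc in merged if doc is not None]

core1 = ["a1", "a2", "a3"]  # hits from core 1 (hypothetical ids)
core2 = ["b1", "b2"]        # hits from core 2
core3 = ["c1"]              # hits from core 3
print(interleave(core1, core2, core3))  # ['a1', 'b1', 'c1', 'a2', 'b2', 'a3']
```

Relevance ordering within each core is preserved; only the cross-core balance
is imposed.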


RE: how to suppress result

2008-04-07 Thread Jae Joo

I have the same situation: 30 million documents indexed, 3 million
deleted.

DELETE cannot be posted the same way as ADD: we can add multiple
documents in one file, but we cannot delete multiple documents in one
file.


If there is a RANGE of IDs, build the range first, then delete those
records from the index with
{URL} <delete><query>id:[xxx TO yyy]</query></delete>


Each delete xml has to be sent to the Solr engine ONE BY ONE.

I am also trying to delete from the index directly through the Lucene
API. It may work, but I have not tried it yet.

Thanks,

Jae

-Original Message-
From: Evgeniy Strokin [mailto:[EMAIL PROTECTED] 
Sent: Monday, April 07, 2008 11:56 AM
To: Solr User
Subject: how to suppress result

Hello,.. I have odd problem.
I use Solr for regular search, and it works fine for my task, but my
client has a list of IDs in a flat separate file (he could have huge
amount of the IDs, up to 1M) and he wants to exclude those IDs from
result of the search.
What is the right way to do this?

Any thoughts are greatly appreciated.
Thank you
Gene
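
For moderate ID lists, another option is to exclude the IDs at query time with
a negative filter query rather than deleting anything. A sketch of building
such an fq string - the field name and ids are made up, and a 1M-id list would
have to be chunked across requests or folded into the index instead:

```python
# Build a negative filter query (fq=-id:(...)) that suppresses the given
# document ids from the search result.
def exclusion_fq(ids, field="id"):
    clause = " OR ".join(ids)
    return "-%s:(%s)" % (field, clause)

print(exclusion_fq(["0286-1", "0286-2"]))  # -id:(0286-1 OR 0286-2)
```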



RE: indexing slow, IO-bound?

2008-04-05 Thread Jae Joo

You can adjust indexing performance by configuring these parameters:

<mainIndex>
  <!-- lucene options specific to the main on-disk lucene index -->
  <useCompoundFile>false</useCompoundFile>
  <mergeFactor>10</mergeFactor>
  <maxBufferedDocs>1000</maxBufferedDocs>
  <maxMergeDocs>2147483647</maxMergeDocs>
  <maxFieldLength>1</maxFieldLength>
</mainIndex>


Jae

-Original Message-
From: Britske [mailto:[EMAIL PROTECTED]
Sent: Sat 4/5/2008 10:09 AM
To: solr-user@lucene.apache.org
Subject: indexing slow, IO-bound?
 

Hi, 

I have a schema with a lot of (about 1) non-stored indexed fields, which
I use for sorting. (no really, that is needed). Moreover I have about 30
stored fields. 

Indexing of these documents takes a long time. Because of the size of the
documents (because of the indexed fields) I am currently batching 50
documents at once, which takes about 2 seconds. Without adding the 1
indexed fields to the document, indexing flies at about 15 ms for these 50
documents. Indexing is done using SolrJ.
documents. INdexing is done using SolrJ

This is on a intel core 2 6400 @2.13ghz and 2 gb ram. 

To speed this up I let 2 threads do the indexing in parallel. What happens
is that solr just takes double the time (about 4 seconds) to complete these
two jobs of 50 docs each in parallel. I figured because of the multi-core
setup indexing should improve, which it doesn't. 

Does this perhaps indicate that the setup is IO-bound? What would be your
best guess  (given the fact that the schema has a big amount of indexed
fields) to try next to improve indexing performance? 

Geert-Jan
-- 
View this message in context: 
http://www.nabble.com/indexing-slow%2C-IO-bound--tp16513196p16513196.html
Sent from the Solr - User mailing list archive at Nabble.com.




sort by index id descending?

2008-03-18 Thread Jae Joo
Is there any way to sort by index id - descending? (by order of indexed)

Thanks,
Jae


Re: sort by index id descending?

2008-03-18 Thread Jae Joo
Finding the way how to sort by internal_docid desc.

Thanks,
Jae

On Tue, Mar 18, 2008 at 11:41 AM, Jae Joo [EMAIL PROTECTED] wrote:

 Is there any way to sort by index id - descending? (by order of indexed)

 Thanks,
 Jae



sort by uniq fields

2008-03-17 Thread Jae Joo
I have 30 million documents indexed and tried sorting by sequenceid, which
is unique across the documents.
It is much slower than sorting by pub_date.
sequenceid is not defined as the unique key in schema.xml; the unique key
defined there is item_id.

Does anyone know why?

Thanks,

Jae


Delete document

2008-03-04 Thread Jae Joo
Hi,

I have many documents to be deleted, and the xml file is built as shown below.

delete.xml

<delete><query>id:0286-14582373</query></delete>
<delete><query>id:0286-14582372</query></delete>
<delete><query>id:0286-14582371</query></delete>
<delete><query>id:0286-14582415</query></delete>
<delete><query>id:0286-14582414</query></delete>
<delete><query>id:0286-14582413</query></delete>
<delete><query>id:0286-14582412</query></delete>
<delete><query>id:0286-14582411</query></delete>
<delete><query>id:0286-14582410</query></delete>
<delete><query>id:0286-14582409</query></delete>
<delete><query>id:0286-14582408</query></delete>



Once I post it using the post.sh command (post.sh delete.xml), only the first
document is deleted.

Did I miss something?

Thanks,

Jae
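
The likely cause is that a file with several <delete> root elements is not one
well-formed XML document, so the server stops after the first command. One
workaround is to split the file into individual commands and post each one
separately. A sketch of the splitting only (the HTTP posting to /update is
omitted):

```python
# Split a file containing many <delete>...</delete> commands into a list of
# individually well-formed delete documents.
import re

def split_deletes(text):
    return re.findall(r"<delete>.*?</delete>", text, re.DOTALL)

batch = ("<delete><query>id:0286-14582373</query></delete>\n"
         "<delete><query>id:0286-14582372</query></delete>\n")
commands = split_deletes(batch)
print(len(commands))  # 2
```

Each element of `commands` would then be POSTed to the update handler on its
own, matching the "one by one" behavior discussed elsewhere on this list.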


RE: out of memory every time

2008-03-03 Thread Jae Joo

While the job is running, you can monitor the memory usage.

Use the following command - jstat (you can find in the java/bin directory)

jstat -gc PID 5s -- every 5 seconds.

Jae

-Original Message-
From: Reece [mailto:[EMAIL PROTECTED]
Sent: Mon 3/3/2008 8:20 PM
To: solr-user@lucene.apache.org
Subject: Re: out of memory every time
 
Just guessing, but I'd say it has something to do with the dynamic fields...

I ran a similar operation (docs ranged from 1K to 2MB).  For the
initial indexing, I wrote a job to submit about 100,000 documents to
solr, committing after every 10 docs.  I never sent any optimize
commands.  I also used the example start.jar and didn't specify any
memory constraints.

My job ran for 3 days, and finished without any errors or memory problems.

The only difference I see is that I didn't use any dynamic fields, and
I only stored 2 fields instead of them all.

Just my $0.02
-Reece



On Mon, Mar 3, 2008 at 6:15 PM, Thorsten Scherler [EMAIL PROTECTED] wrote:
 On Mon, 2008-03-03 at 21:43 +0200, Justin wrote:
   I'm indexing a large number of documents.
  
   As a server I'm using the /solr/example/start.jar
  
   No matter how much memory I allocate it fails around 7200 documents.

  How do you allocate the memory?

  Something like:
  java -Xms512M -Xmx1500M -jar start.jar

  You may have a closer look as well at
  http://java.sun.com/j2se/1.5.0/docs/guide/vm/gc-ergonomics.html

  HTH

  salu2



   I am committing every 100 docs, and optimizing every 300.
  
   all of my xml's contain on doc, and can range in size from 2k to 700k.
  
   when I restart the start.jar it again reports out of memory.
  
  
    a sample document looks like this:
    <?xml version="1.0" encoding="UTF-8"?>
    <add>
      <doc>
        <field name="PK">1851</field>
        <field name="ft:genes.Symbol:1851">TRAJ20</field>
        <field name="ft:external_ids.SourceAccession:15531">12049</field>
        <field name="ft:external_ids.SourceAccession:15532">ENSG0211869</field>
        <field name="ft:external_ids.SourceAccession:15533">28735</field>
        <field name="ft:external_ids.SourceAccession:15534">HUgn28735</field>
        <field name="ft:external_ids.SourceAccession:15535">TRA_</field>
        <field name="ft:external_ids.SourceAccession:15536">TRAJ20</field>
        <field name="ft:external_ids.SourceAccession:15537">9953837</field>
        <field name="ft:external_ids.SourceAccession:15538">ENSG0211869</field>
        <field name="ft:aliases_and_descriptions.Value:9775">T cell receptor alpha joining 20</field>
        <field name="ft:cytogenetic_locations.Cytoband:4909">14q11.2</field>
        <field name="ft:cytogenetic_locations.Cytoband:4910">14q11</field>
        <field name="ft:cytogenetic_locations.Cytoband:4911">14q11.2</field>
        <field name="ft:location_extras.ContigRefseq:11806">AE000662.1</field>
        <field name="ft:location_extras.ContigRefseq:11807">M94081.1</field>
        <field name="ft:location_extras.ContigRefseq:11808">CH471078.2</field>
        <field name="ft:location_extras.ContigRefseq:11809">NC_14.7</field>
        <field name="ft:location_extras.ContigRefseq:11810">NT_026437.11</field>
        <field name="ft:location_extras.ContigRefseq:11811">NG_001332.2</field>
        <field name="ft:articles.SourceAccession:192767">8188290</field>
        <field name="ft:articles.Title:192767">The human T-cell receptor TCRAC/TCRDC (C alpha/C delta) region: organization, sequence, and evolution of 97.6 kb of DNA.</field>
        <field name="ft:authors.AuthorName:5909">Koop B.F.</field>
        <field name="ft:authors.AuthorName:6912">Rowen L.</field>
        <field name="ft:authors.AuthorName:6985">Hood L.</field>
        <field name="ft:authors.AuthorName:17109">Wang K.</field>
        <field name="ft:authors.AuthorName:72700">Kuo C.L.</field>
        <field name="ft:authors.AuthorName:84285">Seto D.</field>
        <field name="ft:authors.AuthorName:166156">Lenstra J.A.</field>
        <field name="ft:authors.AuthorName:216734">Howard S.</field>
        <field name="ft:authors.AuthorName:285493">Shan W.</field>
        <field name="ft:authors.AuthorName:346559">Deshpande P.</field>
        <field name="ft:probesets.Name:6773">31311_at</field>
        <field name="ft:probesets.BinaryPattern:6773"></field>
      </doc>
    </add>
  
  
    the schema is (in summary):
    <fields>
      <field name="PK" type="sint" indexed="true" stored="true" required="true"
             multiValued="false" omitNorms="true"/>
      <field name="text" type="text" indexed="true" stored="false"
             multiValued="true" omitNorms="true"/>

      <dynamicField name="ft:*" type="string" indexed="true"
                    stored="true" omitNorms="true"/>
      <dynamicField name="st:*" type="string" indexed="true" stored="true"
                    omitNorms="true"/>
    </fields>

    <uniqueKey>PK</uniqueKey>
    <defaultSearchField>text</defaultSearchField>
    <solrQueryParser defaultOperator="OR"/>

    <copyField source="ft:*" dest="text"/>
    <copyField source="st:*" dest="text"/>

    and my conf is:
    <useCompoundFile>false</useCompoundFile>
    <mergeFactor>100</mergeFactor>
    <maxBufferedDocs>900</maxBufferedDocs>
    <maxMergeDocs>2147483647</maxMergeDocs>
    <maxFieldLength>1</maxFieldLength>
  --
  Thorsten Scherler 

RE: Tomcat(8080) - Solr(80) port setup confusion??

2008-02-28 Thread Jae Joo
As I understand it, Tomcat is the server that supports servlets and JSP, and
Solr is just one application running inside Tomcat.

So there is no separate port # for Solr itself.

Thanks

 

Jae 


Hi All,

I have installed solr through tomcat(5.5.23). Its up and running on port
8080. Its like, if tomcat is running, solr is running and vice versa. I need
tomcat on 8080 and solr on 80 port...Is this possible? Do I need to make
changes in the server.xml of tomcat/conf...Is there any way to do this?

Please let me know if anybody have clues regarding the same...

Thanks in advance...


--
View this message in context: 
http://www.nabble.com/Tomcat%288080%29---Solr%2880%29-port-setup-confusion---tp15736978p15736978.html
Sent from the Solr - User mailing list archive at Nabble.com.





RE: Shared index base

2008-02-26 Thread Jae Joo

In my environment, there is NO big difference between a local disk and a
SAN-based file system: a slight slowdown, but not a problem (1 or 2%).
I have 4 sets of Solr indices, each more than 10 GB, on 3 servers.
I do not think sharing a SINGLE index is a good idea - disk is pretty cheap,
and we can add more disk to the SAN pretty easily.
I have another server, called the Master, with a local-disk-based Solr index,
which is used to update the index.
Sometimes, because of an accident or a timeout, an update does not complete
successfully, and I need to fix it manually.
If you have only one index, there is a risk of messing up that one index.
Thanks,

Jae


-Original Message-
From: Walter Underwood [mailto:[EMAIL PROTECTED]
Sent: Tue 2/26/2008 1:27 PM
To: solr-user@lucene.apache.org
Subject: Re: Shared index base
 
I saw a 100X slowdown running with indexes on NFS.

I don't understand going through a lot of effort with unsupported
configurations just to share an index. Local disk is cheap, the
snapshot stuff works well, and local discs avoid a single point
of failure.

The testing time to make a shared index work with each new
release of Solr is almost certainly more expensive than buying
local disc.

The single point of failure is real issue. I've seen two discs
fail on one RAID. When that happens, you've lost all of your
search for hours or days.

Finally, how do you tell Solr that the index has changed and
it needs a new Searcher? Normally, that is a commit, but you
don't want to commit from a read-only Solr.

wunder

On 2/26/08 10:17 AM, Matthew Runo [EMAIL PROTECTED] wrote:

 I hope so. I've found that every once in a while Solr 1.2 replication
 will die, from a temp-index file that seems to ham it up. Removing
 that file on all the servers fixes the issue though.
 
 We'd like to be able to point all the servers at an NFS location for
 their index files, and use a single server to update it.
 
 Thanks!
 
 Matthew Runo
 Software Developer
 Zappos.com
 702.943.7833
 
 On Feb 26, 2008, at 9:39 AM, Alok Dhir wrote:
 
 Are you saying all the servers will use the same 'data' dir?  Is
 that a supported config?
 
 On Feb 26, 2008, at 12:29 PM, Matthew Runo wrote:
 
 We're about to do the same thing here, but have not tried yet. We
 currently run Solr with replication across several servers. So long
 as only one server is doing updates to the index, I think it should
 work fine.
 
 
 Thanks!
 
 Matthew Runo
 Software Developer
 Zappos.com
 702.943.7833
 
 On Feb 26, 2008, at 7:51 AM, Evgeniy Strokin wrote:
 
 I know there was such discussions about the subject, but I want to
 ask again if somebody could share more information.
 We are planning to have several separate servers for our search
 engine. One of them will be index/search server, and all others
 are search only.
 We want to use SAN (BTW: should we consider something else?) and
 give access to it from all servers. So all servers will use the
 same index base, without any replication, same files.
 Is this a good practice? Did somebody do the same? Any problems
 noticed? Or any suggestions, even about different configurations
 are highly appreciated.
 
 Thanks,
 Gene
 
 
 






RE: Commit performance problem

2008-02-12 Thread Jae Joo
Or, if you have multiple files to be updated, make sure you index all the
files first and commit ONCE at the end of indexing.

Jae

-Original Message-
From: Jae Joo [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, February 12, 2008 10:50 AM
To: solr-user@lucene.apache.org
Subject: RE: Commit performance problem

I have the same experience. I have a 6.5 GB index and update it daily.
Have you ever checked whether the updated file contains no documents and
then tried a commit? I don't know why, but that takes very long - more than
10 minutes.

Jae Joo

-Original Message-
From: Ken Krugler [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, February 12, 2008 10:34 AM
To: solr-user@lucene.apache.org
Subject: Re: Commit performance problem

I have a large solr index that is currently about 6 GB and is suffering
of
severe performance problems during updates. A commit can take over 10
minutes to complete. I have tried to increase max memory to the JVM to
over
6 GB, but without any improvement. I have also tried to turn off
waitSearcher and waitFlush, which do significantly improve the commit
speed.
However, the max number of searchers is then quickly reached.

If you have a large index, then I'd recommend having a separate Solr 
installation that you use to update/commit changes, after which you 
use snappuller or equivalent to swap it in to the live (search) 
system.

Would a switch to another container (currently using Jetty) make any
difference?

Very unlikely.

Does anyone have any other tip for improving the performance?

Switch to Lucene 2.3, and tune the new parameters that control memory 
usage during updating.

-- Ken
-- 
Ken Krugler
Krugle, Inc.
+1 530-210-6378
If you can't find it, you can't fix it


RE: Commit performance problem

2008-02-12 Thread Jae Joo
I have the same experience. I have a 6.5 GB index and update it daily.
Have you ever checked whether the updated file contains no documents and
then tried a commit? I don't know why, but that takes very long - more than
10 minutes.

Jae Joo

-Original Message-
From: Ken Krugler [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, February 12, 2008 10:34 AM
To: solr-user@lucene.apache.org
Subject: Re: Commit performance problem

I have a large solr index that is currently about 6 GB and is suffering
of
severe performance problems during updates. A commit can take over 10
minutes to complete. I have tried to increase max memory to the JVM to
over
6 GB, but without any improvement. I have also tried to turn off
waitSearcher and waitFlush, which do significantly improve the commit
speed.
However, the max number of searchers is then quickly reached.

If you have a large index, then I'd recommend having a separate Solr 
installation that you use to update/commit changes, after which you 
use snappuller or equivalent to swap it in to the live (search) 
system.

Would a switch to another container (currently using Jetty) make any
difference?

Very unlikely.

Does anyone have any other tip for improving the performance?

Switch to Lucene 2.3, and tune the new parameters that control memory 
usage during updating.

-- Ken
-- 
Ken Krugler
Krugle, Inc.
+1 530-210-6378
If you can't find it, you can't fix it


RE: Multiple Search in Solr

2008-02-04 Thread Jae Joo
I have downloaded version 1.3 and built multiple indices.

Since I could not find any way to search multiple indices at the Solr level,
I have written a Lucene application. It is working well.

Jae Joo

-Original Message-
From: Niveen Nagy [mailto:[EMAIL PROTECTED] 
Sent: Monday, February 04, 2008 8:55 AM
To: solr-user@lucene.apache.org
Subject: Multiple Search in Solr

Hello ,

 

I have a question concerning Solr multiple indices. We have 4 Solr
indices in our system, and we want to use distributed (multiple) search
that searches the four indices in parallel. We downloaded the
latest code from svn and applied the patch distributed.patch, but we
need a more detailed description of how to use this patch, what changes
should be applied to the Solr schema, and how these indices should be
located. Another question: can the steps be applied to our indices that
were built with a version from before the distributed patch was applied?

 

 Thanks in advance.

   

Best Regards,

 

Niveen Nagy

 



auto Warming and Special Character

2008-01-22 Thread Jae Joo
In the firstSearcher listener, I need to use the special character & in the q
string, but it complains: Error - filterStart


<listener event="firstSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <str name="q">company_desc:Advertising & Marketing</str>
      <str name="start">0</str>
      <str name="rows">20</str>
      <str name="fl">company_name, score</str>
    </lst>
  </arr>
</listener>

Thanks,

Jae Joo
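
The ampersand is the likely culprit: a raw & in the q string makes
solrconfig.xml ill-formed XML, which can surface as a confusing, seemingly
unrelated startup error. Inside the config it has to be written as &amp;;
a sketch of the escaping:

```python
# XML-escape a query string before embedding it in solrconfig.xml.
from xml.sax.saxutils import escape

q = "company_desc:Advertising & Marketing"
print(escape(q))  # company_desc:Advertising &amp; Marketing
```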


Tomcat and JBOss

2008-01-09 Thread Jae Joo
I have a problem - memory and performance issues at more than 10 requests
(Solr search and facet) per second.
On Tomcat, it requires 4 to 5 GB, but that is still not enough.
Does anyone have experience with high-volume performance issues on Tomcat
and JBoss, and resolutions to share with me?

Thanks,

Jae


Solr Multicore

2008-01-08 Thread Jae Joo
I have set multicores - core0 and core1, core0 is default.

<multicore adminPath="/admin/multicore" persistent="true">
  <core name="core0" instanceDir="core0" default="true"/>
  <core name="core1" instanceDir="core1" />
</multicore>

Once I update the index via http://localhost:8983/solr/update, it updates
core1, not core0.

Also, I tried to set the default core using SETASDEFAULT, but it is reported
as an unknown action command.

Can anyone help me?

Thanks,

Jae


Multicore request

2008-01-08 Thread Jae Joo
I have built two cores - core0 and core1.
Each core has a different set of index data.

I can access core0 and core 1 by
http://localhost:8983/solr/core[01]/admin/form.jsp.

Is there any way to access multiple indexes with a single query?

Thanks,

Jae
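
With the distributed-search work in Solr 1.3, a single request can fan out
over several cores through the shards parameter, assuming the cores share a
schema. A sketch of building such a request URL - host, port, and core names
here are made up:

```python
# Build a distributed-search URL that queries several cores at once via the
# "shards" parameter; the request is sent to one core, which fans out.
from urllib.parse import urlencode

def shard_query(host, cores, q):
    shards = ",".join("%s/solr/%s" % (host, c) for c in cores)
    params = urlencode({"q": q, "shards": shards})
    return "http://%s/solr/%s/select?%s" % (host, cores[0], params)

url = shard_query("localhost:8983", ["core0", "core1"], "china")
print(url)
```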


Tomcat and Solr - out of memory

2008-01-07 Thread Jae Joo
Hi,

What happens if the Solr application hits the maximum heap memory assigned?

Will it die, or just slow down?

Jae


Re: Duplicated Keyword

2008-01-04 Thread Jae Joo
Title of document 1 - "This is document 1 regarding china" - fieldtype = text
Title of document 2 - "This is document 2 regarding china" - fieldtype = text

Once they are indexed, will the index hold 2 "china" entries in the text
field, or just 1 "china" term pointing to both document1 and document2?

Jae

On Jan 4, 2008 10:54 AM, Robert Young [EMAIL PROTECTED] wrote:

 I don't quite understand what you're getting at. What is the problem
 you're encountering or what are you trying to achieve?

 Cheers
 Rob

 On Jan 4, 2008 3:26 PM, Jae Joo [EMAIL PROTECTED] wrote:
  Hi,
 
  Is there any way to dedup the keyword across the documents?
 
  Ex.
 
  china keyword is in doc1 and doc2. Will Solr index have only 1 china
  keyword for both document?
 
  Thanks,
 
  Jae Joo
 



Re: Issues with postOptimize

2007-12-19 Thread Jae Joo
try it.
<listener event="postOptimize" class="solr.RunExecutableListener">
  <str name="exe">/search/replication_test/0/index/solr/bin/snapshooter</str>
  <str name="dir">.</str>
  <bool name="wait">true</bool>
</listener>

Jae

On Dec 19, 2007 9:10 AM, Bill Au [EMAIL PROTECTED] wrote:

 Just changing the permission on the script is not enough.  The id
 executing
 the script needs to have write permission to create the snapshot.

 Bill

 On Dec 18, 2007 6:26 PM, Sunny Bassan [EMAIL PROTECTED] wrote:

  I've set the permissions on the script to execute for all users. And it
  does seem like the user who is running SOLR has the permissions to run
  the script. I've come to the conclusion - Linux permissions are
  annoying, lol. I've also tried setting selinux to permissive mode and
  added the user to the sudoers file, but this has not fixed the issue.
  The only thing that does work is croning the script to run after the
  optimize script.
 
  Sunny
 



Max. number of Error messages

2007-12-18 Thread Jae Joo
Is there any parameter to set the max. number of error messages?
The Solr system was killed after a couple of error messages caused by a
WRONG QUERY.

Thanks,

Jae


Local Disk and SAN

2007-11-30 Thread Jae Joo
Hi,

I have about 20 GB of index, with 1 million transactions per day.
I am deciding between local disk and a SAN-based system (not NFS) for the
disk system.
Is there any performance difference between running a Solr instance with a
20 GB index on local disk and on a SAN-based disk connected via Fibre
Channel?

Thanks,

Jae


score customization

2007-11-15 Thread Jae Joo
Hi,

I am looking for a way to get the score with only two decimal places - e.g.
4.09, something like that.
Currently it has 7 decimal digits: <float name="score">1.8032384</float>

Thanks,

Jae
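
Solr of this vintage returns the raw float score, so the simplest route is to
round it client-side after parsing the response; a trivial sketch:

```python
# Round a Solr relevance score to two decimal places for display.
def short_score(score, digits=2):
    return round(score, digits)

print(short_score(1.8032384))  # 1.8
print(short_score(4.0914271))  # 4.09
```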


snappuller rsync parameter error? - solr hardcoded

2007-11-14 Thread Jae Joo
In the snappuller script, "solr" is hardcoded. Should it be
${master_data_dir}?

# rsync over files that have changed
rsync -Wa${verbose}${compress} --delete ${sizeonly} \
${stats} rsync://${master_host}:${rsyncd_port}/solr/${name}/
${data_dir}/${name}-wip

Thanks,

Jae


Solr/bin script - Solaris bash version?

2007-11-13 Thread Jae Joo
Hi,

Are Solaris bash-based scripts available? A couple of the commands are not
working, and I am wondering if there are any scripts I can use before I
update them myself.

For ex. snapshooter, snappuller, snapinstaller



Thanks,

Jae


snapshot files

2007-11-13 Thread Jae Joo
Hi,

I have successfully generated the snapshot files, but I have a question:
does each snapshot file set include all the files in the index directory?

Here is the file list in the index
_0.fdt_0.fnm_0.nrm_0.tii_1.fdt
_1.fnm_1.nrm_1.tiisegments.gen
_0.fdx_0.frq_0.prx_0.tis_1.fdx
_1.frq_1.prx_1.tissegments_3

And here is the file list of 2 snapshot files.

 snapshot.20071113094936
_0.fdt_0.fdx_0.fnm_0.frq_0.nrm
_0.prx_0.tii_0.tissegments.gen  segments_2


 snapshot.20071113095508
_0.fdt_0.fnm_0.nrm_0.tii_1.fdt
_1.fnm_1.nrm_1.tiisegments.gen
_0.fdx_0.frq_0.prx_0.tis_1.fdx
_1.frq_1.prx_1.tissegments_3

The latter one has all the files, the same as the index directory.

I have changed the snapshooter script because cp on Solaris does not
support the -l option.

#cp -lr ${data_dir}/index ${temp} -- original
mkdir ${temp}
ln  ${data_dir}/index/* ${temp}



Thanks,

Jae Joo
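
The mkdir + ln workaround above mirrors what cp -lr does on Linux: hard-link
every file from index/ into the snapshot directory, so the snapshot costs no
extra disk. The same idea in portable form - the paths below are temporary
and exist only for demonstration:

```python
# Snapshot an index directory by hard-linking each file into a new directory,
# the same trick snapshooter's `cp -lr` relies on.
import os
import tempfile

def snapshot(index_dir, snap_dir):
    os.mkdir(snap_dir)
    for name in os.listdir(index_dir):
        # A hard link shares the file's data blocks; no copy is made.
        os.link(os.path.join(index_dir, name), os.path.join(snap_dir, name))

base = tempfile.mkdtemp()
index = os.path.join(base, "index")
os.mkdir(index)
with open(os.path.join(index, "segments.gen"), "w") as f:
    f.write("data")

snapshot(index, os.path.join(base, "snapshot.20071113094936"))
print(sorted(os.listdir(os.path.join(base, "snapshot.20071113094936"))))
# ['segments.gen']
```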


Query and heap Size

2007-11-12 Thread Jae Joo
In my system, the heap usage (old generation) keeps growing under heavy
traffic.
I have adjusted the size of the young generation, but that does not help
much.

Does anyone have any recommendation regarding this issue? - Solr
configuration and/or web.xml ...etc...

Thanks,

Jae


Re: Multiple indexes

2007-11-12 Thread Jae Joo
Here is my situation.

I have 6 million articles indexed and am adding about 10k articles every day.
If I maintain only one index, whenever the daily feed is running it
consumes the heap and causes full GCs.
I am considering having multiple indexes - one for the ongoing query
service and one for updates. Once an update is done, the indexes would be
switched automatically and/or by my application.

Thanks,

Jae joo


On Nov 12, 2007 8:48 AM, Ryan McKinley [EMAIL PROTECTED] wrote:

 The advantages of a multi-core setup are configuration flexibility and
 dynamically changing available options (without a full restart).

 For high-performance production solr servers, I don't think there is
 much reason for it.  You may want to split the two indexes on to two
 machines.  You may want to run each index in a separate JVM (so if one
 crashes, the other does not)

 Maintaining 2 indexes is pretty easy, if that was a larger number or you
 need to create indexes for each user in a system then it would be worth
 investigating the multi-core setup (it is still in development)

 ryan


 Pierre-Yves LANDRON wrote:
  Hello,
 
  Until now, i've used two instance of solr, one for each of my
 collections ; it works fine, but i wonder
  if there is an advantage to use multiple indexes in one instance over
 several instances with one index each ?
  Note that the two indexes have different schema.xml.
 
  Thanks.
  PL
 
  Date: Thu, 8 Nov 2007 18:05:43 -0500
  From: [EMAIL PROTECTED]
  To: solr-user@lucene.apache.org
  Subject: Multiple indexes
 
  Hi,
 
  I am looking for the way to utilize the multiple indexes for signle
 sole
  instance.
  I saw that there is the patch 215  available  and would like to ask
 someone
  who knows how to use multiple indexes.
 
  Thanks,
 
  Jae Joo
 
  _
  Discover the new Windows Vista
  http://search.msn.com/results.aspx?q=windows+vistamkt=en-USform=QBRE




Re: Multiple indexes

2007-11-12 Thread Jae Joo
I have built the master Solr instance and indexed some files. When I run
snapshooter (snapshooter -d data/index, in the solr/bin directory), it
fails with the error below.
Did I miss something?

++ date '+%Y/%m/%d %H:%M:%S'
+ echo 2007/11/12 12:38:40 taking snapshot
/solr/master/solr/data/index/snapshot.20071112123840
+ [[ -n '' ]]
+ mv /solr/master/solr/data/index/temp-snapshot.20071112123840 /solr/master/solr/data/index/snapshot.20071112123840
mv: cannot access /solr/master/solr/data/index/temp-snapshot.20071112123840
Jae

On Nov 12, 2007 9:09 AM, Ryan McKinley [EMAIL PROTECTED] wrote:


 just use the standard collection distribution stuff.  That is what it is
 made for! http://wiki.apache.org/solr/CollectionDistribution

 Alternatively, open up two indexes using the same config/dir -- do your
 indexing on one and the searching on the other.  when indexing is done
 (or finishes a big chunk) send a <commit/> to the 'searching' one and it
 will see the new stuff.

 ryan



 Jae Joo wrote:
  Here is my situation.
 
  I have 6 millions articles indexed and adding about 10k articles
 everyday.
  If I maintain only one index, whenever the daily feeding is running, it
  consumes the heap area and causes FGC.
  I am thinking the way to have multiple indexes - one is for ongoing
 querying
  service and one is for update. Once update is done, switch the index by
  automatically and/or my application.
 
  Thanks,
 
  Jae joo
 
 
  On Nov 12, 2007 8:48 AM, Ryan McKinley [EMAIL PROTECTED] wrote:
 
  The advantages of a multi-core setup are configuration flexibility and
  dynamically changing available options (without a full restart).
 
  For high-performance production solr servers, I don't think there is
  much reason for it.  You may want to split the two indexes on to two
  machines.  You may want to run each index in a separate JVM (so if one
  crashes, the other does not)
 
  Maintaining 2 indexes is pretty easy, if that was a larger number or
 you
  need to create indexes for each user in a system then it would be worth
  investigating the multi-core setup (it is still in development)
 
  ryan
 
 
  Pierre-Yves LANDRON wrote:
  Hello,
 
  Until now, i've used two instance of solr, one for each of my
  collections ; it works fine, but i wonder
  if there is an advantage to use multiple indexes in one instance over
  several instances with one index each ?
  Note that the two indexes have different schema.xml.
 
  Thanks.
  PL
 
  Date: Thu, 8 Nov 2007 18:05:43 -0500
  From: [EMAIL PROTECTED]
  To: solr-user@lucene.apache.org
  Subject: Multiple indexes
 
  Hi,
 
  I am looking for the way to utilize the multiple indexes for signle
  sole
  instance.
  I saw that there is the patch 215  available  and would like to ask
  someone
  who knows how to use multiple indexes.
 
  Thanks,
 
  Jae Joo
  _
  Discover the new Windows Vista
  http://search.msn.com/results.aspx?q=windows+vistamkt=en-USform=QBRE
 
 




Solr and Lucene Indexing Performance

2007-11-02 Thread Jae Joo
Hi,

I have 6 million articles to be indexed by Solr and need your
recommendation.

I need to parse the source data and generate Solr XML files to post. What
about using Lucene directly?
From a short test, it looks like Solr-based indexing is faster than
direct indexing through Lucene.

Did I do something wrong, or does Solr use multithreading or
something else to get its good indexing performance?

Thanks

Jae Joo


Remote access - Solr index for deleting

2007-10-30 Thread Jae Joo
Hi,

I am trying to delete a document remotely through a curl command, but got an
internal server error - Permission Denied.
Does anyone know how to solve this problem?

Thanks,

Jae
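For reference, a delete is normally posted to the update handler as a small XML message; a minimal sketch, assuming a default install at localhost:8983 and a uniqueKey value of 12345 (both placeholders):

```shell
# Build the delete-by-id message; the id value is a placeholder.
printf '%s' '<delete><id>12345</id></delete>' > /tmp/delete.xml

# Post it to a running Solr, then commit (uncomment against a live server):
# curl 'http://localhost:8983/solr/update' -H 'Content-Type: text/xml' --data-binary @/tmp/delete.xml
# curl 'http://localhost:8983/solr/update' -H 'Content-Type: text/xml' --data-binary '<commit/>'

cat /tmp/delete.xml
```

If the message format is right, a "Permission Denied" wrapped in an internal server error tends to come from the servlet container or filesystem permissions rather than from the delete syntax itself.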


Delete index and commit or optimize

2007-10-25 Thread Jae Joo
Hi,

I have a 9 GB index and am trying to delete a couple of documents. The actual
deletion works fine.

Here is my question:
Do I have to OPTIMIZE the index after deleting, or just COMMIT? The
original index is already optimized.

Thanks,

Jae Joo


Solr Index update - specific field only

2007-10-25 Thread Jae Joo
Hi,

I have an index in which one field is NOT stored, and I would like to update
another field that is indexed and stored.
Updating the index requires resending all fields exactly as in the original
document (before updating), plus the updated field.
Is there any way to post JUST THE UPDATED FIELD?
Here is an example.
field        indexed  stored
-----------  -------  ------
item_id      yes      yes
searchable   yes      yes
price        yes      yes
title        yes      yes
description  yes      no

The way I know to update the Searchable field from Y to N for item_id
12345:
<add>
  <doc>
    <field name="item_id">12345</field>
    <field name="searchable">Y</field>
    <field name="price">6699</field>
    <field name="title">title sample</field>
    <field name="description">This is the detail description of item</field>
  </doc>
</add>

and I am looking for a way to update the specific field with:

<add>
  <doc>
    <field name="item_id">12345</field>
    <field name="searchable">Y</field>
  </doc>
</add>
  -- which would keep the unchanged fields.

Thanks,

Jae Joo


Re: Syntax for newSearcher query

2007-10-16 Thread Jae Joo
Do I have to define the str name/value pairs exactly the same as in the actual
query (including order)?

Here is actual query

indent=on&version=2.2&facet=true&facet.mincount=1&facet.field=phys_state&facet.field=sic1&facet.limit=-1&sort=sales_volume_us+desc&q=%28phys_country%3A%22United+States%22%29&start=0&rows=20&fl=duns_number%2Ccompany_name%2Cphys_address%2C+phys_state%2C+phys_city%2C+phys_zip%2C+ticker_symbol%2C+status_id_descr%2Cscore&qt=&wt=&explainOther=&hl.fl=


In the newSearcher event, I defined:

   <listener event="newSearcher" class="solr.QuerySenderListener">
     <arr name="queries">
       <lst>
         <str name="facet">true</str>
         <str name="facet.mincount">1</str>
         <str name="facet.field">phys_state</str>
         <str name="facet.field">sic1</str>
         <str name="sort">sales_volume_us desc</str>
         <str name="q">phys_country:United States</str>
         <str name="start">0</str>
         <str name="rows">20</str>
         <str name="fl">duns_number, company_name, phys_address, phys_state, phys_city, phys_zip, ticker_symbol, status_id_descr, score</str>
       </lst>
     </arr>
   </listener>

But I am not sure whether this is working (maybe not!).

Is there anything else I missed in the configuration?

Thanks,

Jae




On 10/10/07, BrendanD [EMAIL PROTECTED] wrote:


 Awesome! Thanks!


 hossman wrote:
 
 
  : looking queries that I'm not quite sure how to specify in my
  solrconfig.xml
  : file in the newSearcher section.
 
  :
 
 rows=20&start=0&facet.query=attribute_id:1003278&facet.query=attribute_id:1003928&sort=merchant_count+desc&facet=true&facet.field=min_price_cad_rounded_to_tens&facet.field=manufacturer_id&facet.field=merchant_id&facet.field=has_coupon&facet.field=has_bundle&facet.field=has_sale_price&facet.field=has_promo&fq=product_is_active:true&fq=product_status_code:complete&fq=category_id:1001143&qt=sti_dismax_en&f.min_price_cad_rounded_to_tens.facet.limit=-1
 
  all you have to do is put each key=val pair as a <str name="key">val</str>
 
  it doesn't matter what the param is, or if it's a param that has multiple
  values, just list each of them the same way...
 
  <listener event="firstSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <lst> <!-- first query -->
        <str name="rows">20</str>
        <str name="start">0</str>
        <str name="facet.query">attribute_id:1003278</str>
        <str name="facet.query">attribute_id:1003928</str>
  ...
      </lst>
      <lst> <!-- second query -->
    ...
 
 
  -Hoss
 
 
 

 --
 View this message in context:
 http://www.nabble.com/Syntax-for-newSearcher-query-tf4604487.html#a13148914
 Sent from the Solr - User mailing list archive at Nabble.com.




Merging Fields

2007-10-05 Thread Jae Joo
Is there any way to merge fields at indexing time?

I have field1 and field2 and would like to combine them into field3.
Both field1 and field2 are in the document, so I might be able to build field3
using copyField.

Thanks,

Jae
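For what it's worth, copyField in schema.xml does exactly this merge at index time; a minimal sketch (the field names mirror the question, the types are assumptions):

```xml
<!-- field1 and field2 are indexed on their own -->
<field name="field1" type="text" indexed="true" stored="true"/>
<field name="field2" type="text" indexed="true" stored="true"/>
<!-- field3 receives a copy of both values at index time;
     multiValued because it takes more than one source -->
<field name="field3" type="text" indexed="true" stored="false" multiValued="true"/>

<copyField source="field1" dest="field3"/>
<copyField source="field2" dest="field3"/>
```

The copy happens on the raw field value before analysis, so field3 is analyzed with its own type.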


Solr - Lucene Query

2007-10-04 Thread Jae Joo
<doc>
  <field name="trade1"><![CDATA[Appraisal Station, The]]></field>
</doc>

In the schema.xml, this field is defined by <field name="trade1" type="text"
indexed="true" />



Is there any way to find the document by querying "The Appraisal Station"?


Thanks,
Jae
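Whether that query matches depends on the analyzer behind the "text" type. With a tokenizing, stopword-removing, lowercasing analyzer (as in the stock example schema), "Appraisal Station, The" and a query for "The Appraisal Station" reduce to the same tokens with "the" dropped and the comma stripped, so a non-phrase query should match. A sketch of such a fieldType (the filter selection is an assumption modeled on the example schema):

```xml
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- split on whitespace, drop stopwords such as "the",
         strip punctuation via word-delimiter splitting, then lowercase -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With this analysis, the stored token stream for the field is roughly {appraisal, station}, which the analyzed query terms also produce.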


Indexing without application server

2007-09-28 Thread Jae Joo
Hi,

I have multiple millions of documents to be indexed and am looking for a way to
index them without a J2EE application server.
It is not incremental indexing; it is a kind of "index once, use forever" -
all batch mode.

I would guess that if there is a way to index without J2EE, it may be much
faster...

Thanks,

Jae Joo


LockObtainFailedException

2007-09-27 Thread Jae Joo
Will anyone help me understand why this happens and how to fix it?


org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out:
SimpleFSLock@/usr/local/searchengine/apache-solr-1.2.0/fr_companies/solr/data/index/write.lock
        at org.apache.lucene.store.Lock.obtain(Lock.java:70)
        at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:579)
        at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:341)
        at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:65)
        at org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:120)
        at org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:181)
        at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:259)
        at org.apache.solr.handler.XmlUpdateRequestHandler.update(XmlUpdateRequestHandler.java:166)
        at org.apache.solr.handler.XmlUpdateRequestHandler.handleRequestBody(XmlUpdateRequestHandler.java:84)

Thanks,

Jae Joo


Re: LockObtainFailedException

2007-09-27 Thread Jae Joo
In solrconfig.xml:
    <useCompoundFile>false</useCompoundFile>
    <mergeFactor>10</mergeFactor>
    <maxBufferedDocs>25000</maxBufferedDocs>
    <maxMergeDocs>1400</maxMergeDocs>
    <maxFieldLength>500</maxFieldLength>
    <writeLockTimeout>1000</writeLockTimeout>
    <commitLockTimeout>1</commitLockTimeout>

Is writeLockTimeout too small?

Thanks,

Jae
On 9/27/07, matt davies [EMAIL PROTECTED] wrote:

 quick fix

 look for a lucene lock file in your tmp directory and delete it, then
 restart solr, should start

 I am an idiot though, so be careful, in fact, I'm worse than an
 idiot, I know a little

 :-)

 you got a lock file somewhere though, deleting that will help you
 out, for me it was in my /tmp directory

 On 27 Sep 2007, at 14:10, Jae Joo wrote:

  will anyone help me why and how?
 
 
   org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out:
   SimpleFSLock@/usr/local/searchengine/apache-solr-1.2.0/fr_companies/solr/data/index/write.lock
       at org.apache.lucene.store.Lock.obtain(Lock.java:70)
       at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:579)
       at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:341)
       at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:65)
       at org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:120)
       at org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:181)
       at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:259)
       at org.apache.solr.handler.XmlUpdateRequestHandler.update(XmlUpdateRequestHandler.java:166)
       at org.apache.solr.handler.XmlUpdateRequestHandler.handleRequestBody(XmlUpdateRequestHandler.java:84)
 
  Thanks,
 
  Jae Joo




RAMDirectory

2007-09-22 Thread Jae Joo
Hi,

Does anyone know how to use a RAM disk for the index?

Thanks,

Jae Joo


Re: caching query result

2007-09-10 Thread Jae Joo
Here is the response XML faceted by multiple fields including state.
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1782</int>
    <lst name="params">
      <str name="facet.limit">-1</str>
      <str name="wt"/>
      <str name="rows">10</str>
      <str name="start">0</str>
      <str name="sort">score desc</str>
      <str name="facet">true</str>
      <str name="facet.mincount">1</str>
      <str name="fl">duns_number,company_name,phys_state, phys_city, score</str>
      <str name="q">phys_country:United States</str>
      <str name="qt"/>
      <str name="version">2.2</str>
      <str name="explainOther"/>
      <str name="hl.fl"/>
      <arr name="facet.field">
        <str>sales_range</str>
        <str>total_emp_range</str>
        <str>company_type</str>
        <str>phys_state</str>
        <str>sic1</str>
      </arr>
      <str name="indent">on</str>
    </lst>
  </lst>
</response>

On 9/6/07, Yonik Seeley [EMAIL PROTECTED] wrote:

 On 9/6/07, Jae Joo [EMAIL PROTECTED] wrote:
  I have 13 million documents and facet by state (50). If there is a
  caching mechanism, I may get results back faster.

 How fast are you getting results back with standard field faceting
 (facet.field=state)?



caching query result

2007-09-06 Thread Jae Joo
Hi,

I am wondering: is there any way to CACHE FACETED SEARCH results?

I have 13 million documents and facet by state (50). If there is a caching
mechanism, I may get results back faster.

Thanks,

Jae


Heap size error during indexing

2007-09-01 Thread Jae Joo
Hi,

I have a Java heap size problem during indexing of 13 million docs under
Linux using post.sh (optimized).
Each document is about 2 KB.

Is there any way to set the Java heap size for post.sh under Tomcat?

Thanks,

Jae Joo
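As an aside: post.sh itself only runs curl, so the heap that matters is the servlet container's, not the posting script's. Under Tomcat the heap is usually set through CATALINA_OPTS (or JAVA_OPTS) before starting the container; a sketch with placeholder sizes:

```shell
# Heap is configured on the Tomcat side, not in post.sh (which only runs curl).
# The sizes below are placeholders; tune them to your machine.
export CATALINA_OPTS="-Xms512m -Xmx2048m"

# Then restart the container so it picks the options up (uncomment on the server):
# $CATALINA_HOME/bin/catalina.sh stop
# $CATALINA_HOME/bin/catalina.sh start

echo "$CATALINA_OPTS"
```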


Re: Trouble with Windows / Tomcat install

2007-09-01 Thread Jae Joo
Did you create solr.xml in $CATALINA_HOME/conf/Catalina/localhost?
If yes, please double-check the directory information.
And did you copy apache-solr-1.2.0.war from the dist directory to solr.war?

Jae

On 9/1/07, Robin Bonin [EMAIL PROTECTED] wrote:

 Hi all, I followed the instructions in the wiki here,
 http://wiki.apache.org/solr/SolrTomcat
 I know Tomcat is running, but when I pull up my solr admin page, I get
 the following error.


 description The server encountered an internal error () that prevented
 it from fulfilling this request.

  exception org.apache.jasper.JasperException
      org.apache.jasper.servlet.JspServletWrapper.handleJspException(JspServletWrapper.java:476)
      org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:371)
      org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:315)
      org.apache.jasper.servlet.JspServlet.service(JspServlet.java:265)
      javax.servlet.http.HttpServlet.service(HttpServlet.java:803)

  root cause javax.servlet.ServletException
      org.apache.jasper.runtime.PageContextImpl.doHandlePageException(PageContextImpl.java:846)
      org.apache.jasper.runtime.PageContextImpl.handlePageException(PageContextImpl.java:779)
      org.apache.jsp.admin.index_jsp._jspService(index_jsp.java:313)
      org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:98)
      javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
      org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:328)
      org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:315)
      org.apache.jasper.servlet.JspServlet.service(JspServlet.java:265)
      javax.servlet.http.HttpServlet.service(HttpServlet.java:803)

  root cause java.lang.NoClassDefFoundError
      org.apache.jsp.admin.index_jsp._jspService(index_jsp.java:80)
      org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:98)
      javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
      org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:328)
      org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:315)
      org.apache.jasper.servlet.JspServlet.service(JspServlet.java:265)
      javax.servlet.http.HttpServlet.service(HttpServlet.java:803)

 Apache Tomcat/5.5.23



Re: Trouble with Windows / Tomcat install

2007-09-01 Thread Jae Joo
Connecting Solr and Tomcat does not require any copying or moving of jar
files; all of the jars are inside the solr.war file.

Can you send your solr.xml file?
If you use \ instead of /, you have to use \\ to point at the Solr
instance in the solr.xml config.

Jae
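For comparison, the context fragment the SolrTomcat wiki describes - saved as $CATALINA_HOME/conf/Catalina/localhost/solr.xml - looks roughly like the sketch below; the paths are placeholders, and forward slashes sidestep the backslash-escaping question on Windows:

```xml
<Context docBase="C:/web/solr.war" debug="0" crossContext="true">
  <Environment name="solr/home" type="java.lang.String" value="C:/web/solr" override="true"/>
</Context>
```

The solr/home Environment entry plays the same role as the -Dsolr.solr.home system property mentioned later in this thread.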

On 9/1/07, Robin Bonin [EMAIL PROTECTED] wrote:

  I tried both solr-1.1 and 1.2; I was having more trouble with 1.2, so
  I went back to 1.1.
 I did copy the war from dist, and renamed to just solr, but I have no
 xml file for solr in conf.
 I was using the java 'option' -Dsolr.solr.home=C:\Solr\

 I just removed the solr war and folder from web apps and moved to
 tomcat\shared\lib and created a solr.xml file under locahost with the
 correct path to the war,and solr folder, and I get the same error.

 I tried changing the paths in the XML to the wrong ones to watch how
 the message changed and I found the problem (mid email)...

 the step 'Copy the contents of the example directory
 c:\temp\solrZip\example\solr\ to c:\web\solr\'

  I had copied everything from the example directory, not example\solr,
  so the path was a directory off.

 Thanks for your help.


 On 9/1/07, Jae Joo [EMAIL PROTECTED] wrote:
  did you build solr.xml in $CATALINA_HOME/conf//Catalina/localhost ?
  it yes, please double check the directory information.
  And did you copy the apache-solr-1.2.0.war to solr.war in dist
 directory?
 
  Jae
 
  On 9/1/07, Robin Bonin [EMAIL PROTECTED] wrote:
  
   Hi all, I followed the instructions in the wiki here,
   http://wiki.apache.org/solr/SolrTomcat
   I know Tomcat is running, but when I pull up my solr admin page, I get
   the following error.
  
  
   description The server encountered an internal error () that prevented
   it from fulfilling this request.
  
    exception org.apache.jasper.JasperException
        org.apache.jasper.servlet.JspServletWrapper.handleJspException(JspServletWrapper.java:476)
        org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:371)
        org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:315)
        org.apache.jasper.servlet.JspServlet.service(JspServlet.java:265)
        javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
   
    root cause javax.servlet.ServletException
        org.apache.jasper.runtime.PageContextImpl.doHandlePageException(PageContextImpl.java:846)
        org.apache.jasper.runtime.PageContextImpl.handlePageException(PageContextImpl.java:779)
        org.apache.jsp.admin.index_jsp._jspService(index_jsp.java:313)
        org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:98)
        javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
        org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:328)
        org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:315)
        org.apache.jasper.servlet.JspServlet.service(JspServlet.java:265)
        javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
   
    root cause java.lang.NoClassDefFoundError
        org.apache.jsp.admin.index_jsp._jspService(index_jsp.java:80)
        org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:98)
        javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
        org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:328)
        org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:315)
        org.apache.jasper.servlet.JspServlet.service(JspServlet.java:265)
        javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
  
   Apache Tomcat/5.5.23
  
 



  1   2   >