RE: java "GC overhead limit exceeded"

2010-07-26 Thread Bastian Spitzer
Hi,

Which version do you use? 1.4.1 is highly recommended, since previous versions
contained some bugs related to memory usage that could lead to memory leaks. I
had this GC overhead limit in my setup as well; the only workaround that helped
was a daily restart of all instances.

With 1.4.1 this issue seems to be fixed.

-Ursprüngliche Nachricht-
Von: Jonathan Rochkind [mailto:rochk...@jhu.edu] 
Gesendet: Dienstag, 27. Juli 2010 01:18
An: solr-user@lucene.apache.org
Betreff: java "GC overhead limit exceeded"

I am now occasionally getting a Java "GC overhead limit exceeded" error in my 
Solr. This may or may not be related to recently adding much better (and more) 
warming queries.

I can get it when trying a 'commit', after deleting all documents in my index, 
or in other cases.

Anyone run into this, and have suggestions as to how to set my java options to 
eliminate it?  I'm not sure this simply means that my heap size needs to be 
bigger; it seems to be something else.

Any advice appreciated. Googling didn't get me much I trusted.

Jonathan


Re: spell checking....

2010-07-26 Thread satya swaroop
This is in solrconfig.xml:::

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">

  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="classname">solr.IndexBasedSpellChecker</str>
    <str name="field">spell</str>
    <str name="spellcheckIndexDir">./spellchecker</str>
    <str name="accuracy">0.7</str>
    <str name="buildOnCommit">true</str>
    <str name="buildOnOptimize">true</str>
  </lst>

  <lst name="spellchecker">
    <str name="name">jarowinkler</str>
    <str name="field">lowerfilt</str>
    <str name="distanceMeasure">org.apache.lucene.search.spell.JaroWinklerDistance</str>
    <str name="spellcheckIndexDir">./spellchecker</str>
    <str name="buildOnCommit">true</str>
    <str name="buildOnOptimize">true</str>
  </lst>

  <str name="queryAnalyzerFieldType">textSpell</str>
</searchComponent>
 i added the following in the standard request handler::

<requestHandler name="standard" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck.onlyMorePopular">false</str>
    <str name="spellcheck.extendedResults">false</str>
    <str name="spellcheck.count">1</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>

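With that config in place, the dictionary has to be built before suggestions
appear. A hedged example (the handler path follows the URL used elsewhere in
this thread):

http://localhost:8080/solr/spell?q=javs&spellcheck=true&spellcheck.build=true&spellcheck.collate=true

spellcheck.build=true (re)builds the spelling index from the configured field;
with buildOnCommit=true it is also rebuilt on every commit.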

Re: Querying throws java.util.ArrayList.RangeCheck

2010-07-26 Thread Yonik Seeley
Do you have any custom code, or is this stock solr (and which version,
and what is the request)?

-Yonik
http://www.lucidimagination.com

On Tue, Jul 27, 2010 at 12:30 AM, Manepalli, Kalyan
 wrote:
> Hi,
>   I am stuck at this weird problem during querying. While querying the solr 
> index I am getting the following error.
> Index: 52, Size: 16 java.lang.IndexOutOfBoundsException: Index: 52, Size: 16 
> at java.util.ArrayList.RangeCheck(ArrayList.java:547) at 
> java.util.ArrayList.get(ArrayList.java:322) at 
> org.apache.lucene.index.FieldInfos.fieldInfo(FieldInfos.java:288) at 
> org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:217) at 
> org.apache.lucene.index.SegmentReader.document(SegmentReader.java:948) at 
> org.apache.lucene.index.DirectoryReader.document(DirectoryReader.java:506) at 
> org.apache.lucene.index.IndexReader.document(IndexReader.java:947) at 
> org.apache.solr.search.SolrIndexReader.document(SolrIndexReader.java:444) at
>
> During debugging I found that the SolrIndexReader is trying to read a 
> document which doesn't exist in the index.
> I tried optimizing the index and restarting the server but still no luck.
>
> Any help in resolving this issue will be appreciated.
>
> Thanks
> Kalyan


Querying throws java.util.ArrayList.RangeCheck

2010-07-26 Thread Manepalli, Kalyan
Hi,
   I am stuck at this weird problem during querying. While querying the solr 
index I am getting the following error.
Index: 52, Size: 16 java.lang.IndexOutOfBoundsException: Index: 52, Size: 16 at 
java.util.ArrayList.RangeCheck(ArrayList.java:547) at 
java.util.ArrayList.get(ArrayList.java:322) at 
org.apache.lucene.index.FieldInfos.fieldInfo(FieldInfos.java:288) at 
org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:217) at 
org.apache.lucene.index.SegmentReader.document(SegmentReader.java:948) at 
org.apache.lucene.index.DirectoryReader.document(DirectoryReader.java:506) at 
org.apache.lucene.index.IndexReader.document(IndexReader.java:947) at 
org.apache.solr.search.SolrIndexReader.document(SolrIndexReader.java:444) at 

During debugging I found that the SolrIndexReader is trying to read a document 
which doesn't exist in the index. 
I tried optimizing the index and restarting the server but still no luck.

Any help in resolving this issue will be appreciated.

Thanks
Kalyan

Re: Design questions/Schema Help

2010-07-26 Thread Kiwi de coder
i think the search logs will require a lot of storage, which may make the index
size unreasonably large if stored in solr.

and the aggregation results may not really fit the lucene index structure.
:)

kiwi
happy hacking !



On Tue, Jul 27, 2010 at 7:47 AM, Tommy Chheng wrote:

>  Alternatively, have you considered storing (or I should say indexing) the
> search logs with Solr?
>
> This lets you text search across your search queries. You can perform time
> range queries with solr as well.
>
> @tommychheng
> Programmer and UC Irvine Graduate Student
> Find a great grad school based on research interests:
> http://gradschoolnow.com
>
>
>
> On 7/26/10 4:43 PM, Mark wrote:
>
>> We are thinking about using Cassandra to store our search logs. Can
>> someone point me in the right direction/lend some guidance on design? I am
>> new to Cassandra and I am having trouble wrapping my head around some of
>> these new concepts. My brain keeps wanting to go back to a RDBMS design.
>>
>> We will be storing the user query, # of hits returned and their session
>> id. We would like to be able to answer the following questions.
>>
>> - What is the n most popular queries and their counts within the last x
>> (mins/hours/days/etc). Basically the most popular searches within a given
>> time range.
>> - What is the most popular query within the last x where hits = 0. Same as
>> above but with an extra "where" clause
>> - For session id x give me all their other queries
>> - What are all the session ids that searched for 'foos'
>>
>> We accomplish the above functionality w/ MySQL using 2 tables. One for the
>> raw search log information and the other to keep the aggregate/running
>> counts of queries.
>>
>> Would this sort of ad-hoc querying be better implemented using Hadoop +
>> Hive? If so, should I be storing all this information in Cassandra then
>> using Hadoop to retrieve it?
>>
>> Thanks for your suggestions
>>
>>


StatsComponent and sint?

2010-07-26 Thread Jonathan Rochkind
Man, what types of fields is StatsComponent actually known to work with?

With an sint, it seems to have trouble if there are any documents with null 
values for the field. It appears to decide that a null/empty/blank value is 
-1325166535, and is thus the minimum value. 

At least if I'm interpreting what's going on right. Anyone run into this?
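One workaround, if this is the missing-value problem (a hedged sketch; the
field name is illustrative): restrict the stats request to documents that
actually have a value, e.g.

   stats=true&stats.field=price_i&fq=price_i:[* TO *]

The [* TO *] range only matches documents with some value in the field, so
null documents can no longer contribute a bogus minimum. The trade-off is that
the fq narrows the whole result set, not just the stats.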

RE: java "GC overhead limit exceeded"

2010-07-26 Thread Jonathan Rochkind
> Short answer: "GC overhead limit exceeded" means "out of memory".

Aha, thanks. So the answer is just "raise your Xmx/heap size, you need more 
memory to do what you're doing", yeah?

Jonathan



Is there a cache for a query?

2010-07-26 Thread Li Li
I want a cache to cache all result of a query(all steps including
collapse, highlight and facet).  I read
http://wiki.apache.org/solr/SolrCaching, but can't find a global
cache. Maybe I can use external cache to store key-value. Is there any
one in solr?


Re: java "GC overhead limit exceeded"

2010-07-26 Thread Yonik Seeley
On Mon, Jul 26, 2010 at 7:17 PM, Jonathan Rochkind  wrote:
> I am now occasionally getting a Java "GC overhead limit exceeded" error in
> my Solr. This may or may not be related to recently adding much better (and
> more) warming queries.

When memory gets tight, the JVM kicks off a garbage collection to try
to free more space (and it normally can free at least some).  When
only a little memory is freed, and GC keeps kicking in, it starts to
eat up a majority of the CPU time and the JVM gives up with a "GC
overhead limit exceeded" error.

Short answer: "GC overhead limit exceeded" means "out of memory".
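For example (an illustrative sketch, not from the thread; sizes depend on your
index and hardware):

   java -Xmx1024m -jar start.jar
   java -XX:-UseGCOverheadLimit -jar start.jar

Raising -Xmx is the usual fix; -XX:-UseGCOverheadLimit only disables the limit
check itself, which normally just delays a real OutOfMemoryError.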

-Yonik
http://www.lucidimagination.com


Re: Updating fields in Solr

2010-07-26 Thread Erick Erickson
See below:

On Mon, Jul 26, 2010 at 11:49 AM, Pramod Goyal wrote:

> Hi,
> I have a requirement where i need to keep updating certain fields in
> the schema. My requirement is to change some of the fields or add some
> values to a field ( multi-value field ).  I understand that i can use Solr
> update for this. If i am using Solr update do i need to publish the entire
> document again or do i just need to publish the updated fields. Again in
> case of update can i add (append) new values to the existing fields ?
>

updating a document does, indeed, require that you reindex the whole thing.
There's no capability to just update a field.
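A minimal SolrJ sketch of that re-index, assuming Solr 1.4, that every field
in the schema is stored (otherwise the copy loses data), and an illustrative
multi-valued field named "tags":

   import org.apache.solr.client.solrj.SolrQuery;
   import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
   import org.apache.solr.common.SolrDocument;
   import org.apache.solr.common.SolrInputDocument;

   public class Reindex {
     public static void main(String[] args) throws Exception {
       CommonsHttpSolrServer server =
           new CommonsHttpSolrServer("http://localhost:8983/solr");

       // fetch the current version (only complete if every field is stored)
       SolrDocument old = server.query(new SolrQuery("id:doc1")).getResults().get(0);

       // copy all stored fields over, then append to the multi-valued field
       SolrInputDocument doc = new SolrInputDocument();
       for (String name : old.getFieldNames())
         doc.addField(name, old.getFieldValue(name));
       doc.addField("tags", "new-value");

       server.add(doc);   // same uniqueKey, so this replaces the old document
       server.commit();
     }
   }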


>
>In my document most of the parts remains unchanged however few fields
> keeps changing. Will it be costly to update the entire document just to
> change a field ? I was wondering if i should create two solr cores, one for
> static content and another one for dynamic content. This way i can reduce
> the time taken to update a document, but it introduces the complexity of
> querying different core and combining the result on the client side.
>

Do you care how costly it is? By that I mean: what is your expected update
rate, how big is your index, etc.? If you're updating 1 document a day, you
don't care. If you're updating 100/sec, you care very much. In between, it's
an interesting question :).

Multiple cores are a possibility, but you're right that's more complex. I'd
really evaluate (by gathering statistics) whether you need to before trying it.


>
> Is there a way to configure solr so that client can execute a single
> query and solr internally executes multiple queries across different cores
> and return single result ?
>

I'll leave this one to someone else...

Best
Erick


Re: Total number of terms in an index?

2010-07-26 Thread Chris Hostetter

: Sorry, like the subject, I mean the total number of terms.

it's not stored anywhere, so the only way to fetch it is to actually 
iterate all of the terms and count them (that's why LukeRequestHandler is 
so slow to compute this particular value)

If i remember right, someone mentioned at one point that flex would let 
you store data about stuff like this in your index as part of the segment 
writing, but frankly i'm still not sure how that will help -- because 
unless your index is fully optimized, you still have to iterate the terms 
in each segment to 'de-dup' them.
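For reference, a rough sketch of that iteration against a Lucene 2.9 / Solr
1.4 index (the path is illustrative); on a big index this loop is exactly the
slow part:

   import java.io.File;
   import org.apache.lucene.index.IndexReader;
   import org.apache.lucene.index.TermEnum;
   import org.apache.lucene.store.FSDirectory;

   public class CountTerms {
     public static void main(String[] args) throws Exception {
       IndexReader reader =
           IndexReader.open(FSDirectory.open(new File("/path/to/index")));
       TermEnum terms = reader.terms(); // merged view over all segments, de-duped
       long count = 0;
       while (terms.next())
         count++;
       terms.close();
       reader.close();
       System.out.println("total terms: " + count);
     }
   }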


-Hoss



Re: spell checking....

2010-07-26 Thread Erick Erickson
It's almost impossible to analyze this kind of thing without seeing your
schema and debug output. You might want to review:
http://wiki.apache.org/solr/UsingMailingLists

Best
Erick

On Mon, Jul 26, 2010 at 9:56 AM, satya swaroop  wrote:

> hi all,
>i am new to solr and was able to implement indexing of documents
> by following the solr wiki. now i am trying to add the spellchecking. i
> followed the spellcheck component in the wiki but am not getting the suggested
> spellings. i first built it by spellcheck.build=true,...
>
> here i give u the example:::
>
>
> http://localhost:8080/solr/spell?q=javs&spellcheck=true&spellcheck.collate=true
>
> [the XML response was stripped when archived; it contained an empty
> spellcheck suggestions list]
>
>
> here the response should actually suggest "java" but didn't..
>
> can anyone guide me about it...
>  i am using solr 1.4, tomcat in ubuntu
>
>
>
>
>
> Regards,
> swarup
>


Re: Similar search regarding a result document

2010-07-26 Thread Erick Erickson
I need much more detailed information before I can make sense of your use
case. Could you provide a sample?

MoreLikeThis sounds in the right neighborhood, but I'm guessing.
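For example, MoreLikeThis can take a document already in the index and return
similar ones (a hedged sketch; the field names are illustrative):

   http://localhost:8983/solr/select?q=id:1234&mlt=true&mlt.fl=title,content&mlt.count=5

mlt.fl names the fields similarity is computed over; they work best with
termVectors enabled (or at least stored="true").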

Best
Erick

On Mon, Jul 26, 2010 at 9:02 AM,  wrote:

>
>  Hi,
>
> I would like to implement a similar-search feature... but not relative to
> the initial search query but relative to each result document.
>
> The structure of each doc is:
> id
> title
> content
> price
> etc...
>
> Then we have a database of global search queries; I'm thinking of
> integrating this into Solr.
>
> I'm planning to implement this as a query of a query... but before that I
> would like to know if there is a built-in function in Solr for this?
>
> Thanks for your help.
>
>
>
>
>
>
>
>
>
>


Re: question about relevance

2010-07-26 Thread Erick Erickson
I'm having trouble getting my head around what you're trying to accomplish,
so if this is off base, you know why.

But what it smells like is that you're trying to do database-ish things in
a SOLR index, which is almost always the wrong approach. Is there a
way to index redundant data with each document so that all you have to do
to get the "relevant" users is a simple query?

Adding scores is also suspect... I don't see how that does predictable
things.

But I'm also failing completely to understand what a "relevant" user is.

Not much help; if this is way off base, perhaps you could provide some
additional use-cases?

Best
Erick

On Mon, Jul 26, 2010 at 2:37 AM, Bharat Jain  wrote:

> Hello All,
>
> I have an index which stores multiple objects belonging to a user,
>
> for e.g.
> <doc>
>   <field name="userId">...</field>     -> identifies the user
>   <field name="objType">...</field>    -> object type, e.g. userBasic or userAdv
>
>   <field name="...">...</field>        -> MAPS to userBasicInfoObject
>
>   <field name="...">...</field>        -> MAPS to userAdvInfoObject
> </doc>
>
>
> Now when I am doing some query I get multiple records mapping to java
> objects (identified by objType) that belong to the same user.
>
>
> Now I want to show the relevant users at the top of the list. I am thinking
> of adding the Lucene scores of different result documents to get the best
> scores. Is this the correct approach to get the relevance of the user?
>
> Thanks
> Bharat Jain
>


Solr crawls during replication

2010-07-26 Thread Mark
We have an index around 25-30G w/ 1 master and 5 slaves. We perform 
replication every 30 mins. During replication the disk I/O obviously 
shoots up on the slaves to the point where all requests routed to that 
slave take a really long time... sometimes to the point of timing out.


Is there any logical or physical changes we could make to our 
architecture to overcome this problem?


Thanks


Re: Design questions/Schema Help

2010-07-26 Thread Tommy Chheng
 Alternatively, have you considered storing (or I should say indexing) 
the search logs with Solr?


This lets you text search across your search queries. You can perform 
time range queries with solr as well.
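A hedged sketch of what such a log document and the matching requests could
look like (all field names invented for illustration; "query" would need an
untokenized string copy for whole-query facet counts):

   <add><doc>
     <field name="id">q-000001</field>
     <field name="query">bmw car</field>
     <field name="hits">0</field>
     <field name="session_id">abc123</field>
     <field name="tstamp">2010-07-26T12:00:00Z</field>
   </doc></add>

   top queries, last day:  q=*:*&fq=tstamp:[NOW-1DAY TO NOW]&facet=true&facet.field=query&facet.limit=10
   zero-hit queries:       the same request plus fq=hits:0
   one session's queries:  q=session_id:abc123
   who searched 'foos':    q=query:foos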


@tommychheng
Programmer and UC Irvine Graduate Student
Find a great grad school based on research interests: http://gradschoolnow.com


On 7/26/10 4:43 PM, Mark wrote:
We are thinking about using Cassandra to store our search logs. Can 
someone point me in the right direction/lend some guidance on design? 
I am new to Cassandra and I am having trouble wrapping my head around 
some of these new concepts. My brain keeps wanting to go back to a 
RDBMS design.


We will be storing the user query, # of hits returned and their 
session id. We would like to be able to answer the following questions.


- What is the n most popular queries and their counts within the last 
x (mins/hours/days/etc). Basically the most popular searches within a 
given time range.
- What is the most popular query within the last x where hits = 0. 
Same as above but with an extra "where" clause

- For session id x give me all their other queries
- What are all the session ids that searched for 'foos'

We accomplish the above functionality w/ MySQL using 2 tables. One for 
the raw search log information and the other to keep the 
aggregate/running counts of queries.


Would this sort of ad-hoc querying be better implemented using Hadoop 
+ Hive? If so, should I be storing all this information in Cassandra 
then using Hadoop to retrieve it?


Thanks for your suggestions



Re: Design questions/Schema Help

2010-07-26 Thread Mark

On 7/26/10 4:43 PM, Mark wrote:
We are thinking about using Cassandra to store our search logs. Can 
someone point me in the right direction/lend some guidance on design? 
I am new to Cassandra and I am having trouble wrapping my head around 
some of these new concepts. My brain keeps wanting to go back to a 
RDBMS design.


We will be storing the user query, # of hits returned and their 
session id. We would like to be able to answer the following questions.


- What is the n most popular queries and their counts within the last 
x (mins/hours/days/etc). Basically the most popular searches within a 
given time range.
- What is the most popular query within the last x where hits = 0. 
Same as above but with an extra "where" clause

- For session id x give me all their other queries
- What are all the session ids that searched for 'foos'

We accomplish the above functionality w/ MySQL using 2 tables. One for 
the raw search log information and the other to keep the 
aggregate/running counts of queries.


Would this sort of ad-hoc querying be better implemented using Hadoop 
+ Hive? If so, should I be storing all this information in Cassandra 
then using Hadoop to retrieve it?


Thanks for your suggestions 

Whoops wrong forum


Design questions/Schema Help

2010-07-26 Thread Mark
We are thinking about using Cassandra to store our search logs. Can 
someone point me in the right direction/lend some guidance on design? I 
am new to Cassandra and I am having trouble wrapping my head around some 
of these new concepts. My brain keeps wanting to go back to a RDBMS design.


We will be storing the user query, # of hits returned and their session 
id. We would like to be able to answer the following questions.


- What are the n most popular queries and their counts within the last x 
(mins/hours/days/etc)? Basically the most popular searches within a 
given time range.
- What is the most popular query within the last x where hits = 0. Same 
as above but with an extra "where" clause

- For session id x give me all their other queries
- What are all the session ids that searched for 'foos'

We accomplish the above functionality w/ MySQL using 2 tables. One for 
the raw search log information and the other to keep the 
aggregate/running counts of queries.


Would this sort of ad-hoc querying be better implemented using Hadoop + 
Hive? If so, should I be storing all this information in Cassandra then 
using Hadoop to retrieve it?


Thanks for your suggestions


Re: NullPointerException with CURL, but not in browser

2010-07-26 Thread Chris Hostetter

: However, when I'm trying this very URL with curl within my (perl) script, I
: receive a NullPointerException:
: CURL-COMMAND: curl -sL
: 
http://localhost:8983/solr/select?indent=on&version=2.2&q=*&fq=ListId%3A881&start=0&rows=0&fl=*%2Cscore&qt=standard&wt=standard

it appears you aren't quoting the URL, so that first "&" character is 
causing the shell to think you are done with the command, and you want it 
to be backgrounded (although i'm not certain, since it depends on how you 
are having perl execute curl)

i would suggest that you avoid exec/system calls to "curl" from Perl, and 
use an LWP::UserAgent instead.
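For completeness, the quoting fix itself is just (a one-line sketch):

   curl -sL 'http://localhost:8983/solr/select?indent=on&version=2.2&q=*&fq=ListId%3A881&start=0&rows=0&fl=*%2Cscore&qt=standard&wt=standard'

Unquoted, the shell cuts the command at the first "&", so Solr receives a
request with no q parameter at all, which is consistent with the
NullPointerException coming out of QueryParser.parse.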


-Hoss



java "GC overhead limit exceeded"

2010-07-26 Thread Jonathan Rochkind
I am now occasionally getting a Java "GC overhead limit exceeded" error 
in my Solr. This may or may not be related to recently adding much 
better (and more) warming queries.


I can get it when trying a 'commit', after deleting all documents in my 
index, or in other cases.


Anyone run into this, and have suggestions as to how to set my java 
options to eliminate it?  I'm not sure this simply means that my heap size 
needs to be bigger; it seems to be something else.


Any advice appreciated. Googling didn't get me much I trusted.

Jonathan


Re: Total number of terms in an index?

2010-07-26 Thread Jason Rutherglen
Sorry, like the subject, I mean the total number of terms.

On Mon, Jul 26, 2010 at 4:03 PM, Jason Rutherglen
 wrote:
> What's the fastest way to obtain the total number of docs from the
> index?  (The Luke request handler takes a long time to load so I'm
> looking for something else).
>


Total number of terms in an index?

2010-07-26 Thread Jason Rutherglen
What's the fastest way to obtain the total number of docs from the
index?  (The Luke request handler takes a long time to load so I'm
looking for something else).


NullPointerException with CURL, but not in browser

2010-07-26 Thread Rene Rath
Hi *,

I'd like to see how many documents I have in my index with a certain ListId,
in this example ListId 881.

http://localhost:8983/solr/select?indent=on&version=2.2&q=*&fq=ListId%3A881&start=0&rows=0&fl=*%2Cscore&qt=standard&wt=standard

In the browser, the output looks perfect, I indeed have 3 matching documents
in the index:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">4097</int>
    <lst name="params">
      <str name="fl">*,score</str>
      <str name="indent">on</str>
      <str name="start">0</str>
      <str name="q">*</str>
      <str name="qt">standard</str>
      <str name="wt">standard</str>
      <str name="fq">ListId:881</str>
      <str name="version">2.2</str>
      <str name="rows">0</str>
    </lst>
  </lst>
  <result name="response" numFound="3" start="0"/>
</response>




However, when I'm trying this very URL with curl within my (perl) script, I
receive a NullPointerException:
CURL-COMMAND: curl -sL
http://localhost:8983/solr/select?indent=on&version=2.2&q=*&fq=ListId%3A881&start=0&rows=0&fl=*%2Cscore&qt=standard&wt=standard




Error 500 

HTTP ERROR: 500 null

java.lang.NullPointerException
at java.io.StringReader.<init>(StringReader.java:33)
at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:197)
...

Grateful for any kind of help.

cheers - MOPS


Re: How to Combine Drupal solrconfig.xml with Nutch solrconfig.xml?

2010-07-26 Thread David Stuart
Hi Savannah,

I have just answered this question over on drupal.org. 
http://drupal.org/node/811062

Responses 5 and 11 will help you. On the solrconfig.xml side of things 
you will only really need Drupal's version.

Although still in alpha, my Nutch module will help you out with integration: 
http://drupal.org/project/nutch

Regards,

David Stuart

On 26 Jul 2010, at 21:37, Savannah Beckett wrote:

> I am using Drupal ApacheSolr module to integrate solr with drupal.  I already 
> integrated solr with nutch.  I already moved nutch's solrconfig.xml and 
> schema.xml to solr's example directory, and it worked.  I tried to append 
> Drupal's 
> ApacheSolr module's own solrconfig.xml and schema.xml into the same xml 
> files, 
> but I got the following error when I run "java -jar start.jar":
>  
> Jul 26, 2010 1:18:31 PM org.apache.solr.common.SolrException log
> SEVERE: Exception during parsing file: 
> solrconfig.xml:org.xml.sax.SAXParseException: The markup in the document 
> following the root element must be well-formed.
> at 
> com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:249)
> at 
> com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:284)
> 
> at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:124)
> at org.apache.solr.core.Config.<init>(Config.java:110)
> at org.apache.solr.core.SolrConfig.<init>(SolrConfig.java:130)
> at 
> org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:134)
> 
> at 
> org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83)
> 
> Why?  Does solrconfig.xml allow having 2 <config> sections?  Does schema.xml 
> allow having 2 <schema> sections?  
> 
> Thanks.
> 
> 



How to Combine Drupal solrconfig.xml with Nutch solrconfig.xml?

2010-07-26 Thread Savannah Beckett
I am using the Drupal ApacheSolr module to integrate solr with drupal.  I already 
integrated solr with nutch.  I already moved nutch's solrconfig.xml and 
schema.xml to solr's example directory, and it worked.  I tried to append 
Drupal's ApacheSolr module's own solrconfig.xml and schema.xml into the same 
xml files, but I got the following error when I run "java -jar start.jar":
 
Jul 26, 2010 1:18:31 PM org.apache.solr.common.SolrException log
SEVERE: Exception during parsing file: 
solrconfig.xml:org.xml.sax.SAXParseException: The markup in the document 
following the root element must be well-formed.
    at 
com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:249)
    at 
com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:284)

    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:124)
    at org.apache.solr.core.Config.<init>(Config.java:110)
    at org.apache.solr.core.SolrConfig.<init>(SolrConfig.java:130)
    at 
org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:134)

    at 
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83)

Why?  Does solrconfig.xml allow having 2 <config> sections?  Does schema.xml 
allow having 2 <schema> sections?  

Thanks.


  

Solr 3.1 and ExtractingRequestHandler resulting in blank content

2010-07-26 Thread David Thibault
Hello all,

I’m working on a project with Solr.  I had 1.4.1 working OK using 
ExtractingRequestHandler except that it was crashing on some PDFs.  I noticed 
that Tika bundled with 1.4.1 was 0.4, which was kind of old.  I decided to try 
updating to 0.7 as per the directions here: 
http://wiki.apache.org/solr/ExtractingRequestHandler  but it was giving me 
errors (I forget what they were specifically).

Then I tried downloading Solr 3.1 from the source repository, which I noticed 
came with Tika 0.7.  I figured this would be an easier route to get working.  
Now I’m testing with 3.1 and 0.7 and I’m noticing my documents are going into 
Solr OK, but they all have blank content (no document text stored in Solr).  I 
did see that the default “text” field is not stored. Changing that to 
stored=true didn’t help.  Changing to 
fmap.content=attr_content&uprefix=attr_content didn’t help either.  I have 
attached all relevant info here.  Please let me know if someone sees something 
I don’t (it’s entirely possible as I’m relatively new to Solr).

Schema.xml:

[The full schema.xml was pasted here, but its XML markup was stripped when the 
message was archived. It was the stock example schema, with uniqueKey "id", 
defaultSearchField "text", and the "text" field changed to stored="true".]

Solrconfig.xml:

[The full solrconfig.xml was likewise stripped. It was the stock 3.1 example 
config (luceneMatchVersion LUCENE_31, dataDir "C:/Program Files/Apache 
Software Foundation/solr-3.1/data") with the usual standard and dismax 
handlers plus the spellcheck, term vector, clustering, /update/extract, 
terms, and elevation components.]

Re: Solr Doc Lucene Doc !?

2010-07-26 Thread stockii

ah okay thx =)

so the class "SolrInputDocument" is only for indexing a document, and
"SolrDocument" is for the search?

when Solr indexes a document, the first step is to create a SolrInputDocument.
then, in the class "DocumentBuilder", solr creates a Lucene Document in the
function "Document toDocument(SolrInputDoc, Schema)"?!

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Doc-Lucene-Doc-tp995922p997196.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: slave index is bigger than master index

2010-07-26 Thread Chris Hostetter

: No I didn't. I thought you aren't supposed to run optimize on slaves. Well

correct, you should make all changes to the master.

: but it doesn;t matter now, as I think its fixed now. I just added a dummy
: document on master, ran a commit call and then once that executed ran an
: optimize call. This triggered snapshooter to replicate the index, which
: somehow resulted in normal index size at slaves. 

My hunch: are you running on windows?

Windows filesystems have issues with trying to delete a file while 
processes still have the file handle open.  Since Solr needs those "old" 
filehandles to continue serving requests while it opens up the "new" copy 
of the index, those files wind up left on disk.  the *next* time a new 
index is opened, it tries to delete those files again, and then they 
succeed...

http://wiki.apache.org/lucene-java/LuceneFAQ#Why_do_I_have_a_deletable_file_.28and_old_segment_files_remain.29_after_running_optimize.3F

...if you notice this situation happen again, check and see if you have a 
"deletables" file.

-Hoss



Re: Can't find org.apache.solr.client.solrj.embedded

2010-07-26 Thread Chris Hostetter

: where is a Jar, containing org.apache.solr.client.solrj.embedded?

Classes in the embedded package are useless w/o the rest of the Solr 
internal "core" classes, so they are included directly in the 
apache-solr-core-1.4.1.jar.

(i know .. the directory structure doesn't make a lot of sense)

: Also I can't find any other sources than
: 
>http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/src/webapp/src/org/apache/solr/client/solrj/embedded/
: , which does not fit to Solr 1.4.

All the source code for Solr 1.4.1 is included in the 1.4.1 release 
artifacts (the tgz or zip files) .. if you want to find it in SVN it's 
located here...

https://svn.apache.org/repos/asf/lucene/solr/tags/release-1.4.1/



-Hoss



Re: Solr Doc Lucene Doc !?

2010-07-26 Thread Chris Hostetter
: i want to learn more about the technology.
: 
: exists an issue to create really an solrDoc ? Or its in the code only for a
: better understanding of the lucene and solr border ? 

There is a real and actual class named "SolrDocument".  it is a simpler 
object than Lucene's "Document" class because in Solr the details about 
the field types (stored, indexed, etc...) are handled by the schema, and 
are not distinct per Field instance.

http://lucene.apache.org/solr/api/org/apache/solr/common/SolrDocument.html
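A tiny sketch of that boundary, assuming the Solr 1.4 internals (the
IndexSchema would come from the running core, e.g. core.getSchema()):

   import org.apache.lucene.document.Document;
   import org.apache.solr.common.SolrInputDocument;
   import org.apache.solr.schema.IndexSchema;
   import org.apache.solr.update.DocumentBuilder;

   public class Boundary {
     // this call is where a Solr document becomes a Lucene one; the schema
     // supplies the stored/indexed details per field
     static Document convert(SolrInputDocument sdoc, IndexSchema schema) {
       return DocumentBuilder.toDocument(sdoc, schema);
     }
   }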

: -- 
: View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Doc-Lucene-Doc-tp995922p99.html
: Sent from the Solr - User mailing list archive at Nabble.com.
: 



-Hoss



Extracting PDF text/comment/callout/typewriter boxes with Solr CELL/Tika/PDFBox

2010-07-26 Thread Sharp, Jonathan

Every so often I need to index new batches of scanned PDFs and occasionally 
Adobe's OCR can't recognize the text in a couple of these documents. In these 
situations I would like to type in a small amount of text onto the document and 
have it be extracted by Solr CELL.  

Adobe Pro 9 has a number of different ways to add text directly to a PDF file:

*Typewriter
*Sticky Note
*Callout boxes
*Text boxes

I tried indexing documents with each of these text additions with Solr 1.4.1 + 
Solr CELL but can't extract the text in any of these boxes. 

If someone has modified their Solr CELL installation to use more recent 
versions of Tika (above 0.4) or PDFBox (above 0.7.3) and/or can comment on 
whether newer versions can pull the text out of any of these various text 
boxes, I'd appreciate that very much.

-Jon







Re: 2 type of docs in same schema?

2010-07-26 Thread Geert-Jan Brits
I still assume that what you mean by "search queries data" is just some
other form of document (in this case containing 1 search-request per
document).
I'm not sure what you intend to do by that actually, but yes, indexing stays
the same (you probably want to mark the field "type" as required so you don't
forget to include it in your indexing program).
2010/7/26 

>
>  Thanks for you answer! That's great.
>
> Now, to index search query data, is there something special to do? Or does
> it stay as usual?
>
>
>
>
>
>
>
>
> -Original Message-
> From: Geert-Jan Brits 
> To: solr-user@lucene.apache.org
> Sent: Mon, Jul 26, 2010 4:57 pm
> Subject: Re: 2 type of docs in same schema?
>
>
> You can easily have different types of documents in 1 core:
>
> 1. define searchquery as a field(just as the others in your schema)
> 2. define type as a field (this allows you to decide which type of
> documents
> to search for, e.g: "type_normal" or "type_search")
>
> now searching on regular docs becomes:
> q=title:some+title&fq=type:type_normal
>
> and searching for searchqueries becomes (I think this is what you want):
> q=searchquery:bmw+car&fq=type:type_search
>
> Geert-Jan
>
> 2010/7/26 
>
> >
> >
> >
> >  I need you expertise on this one...
> >
> > We would like to index every search query that is passed in our solr
> engine
> > (same core)
> >
> > Our docs format are like this (already in our schema):
> > title
> > content
> > price
> > category
> > etc...
> >
> > Now how to add "search queries" as a field in our schema? Know that the
> > search queries won't have all the field above?
> > For example:
> > q=bmw car
> > q=car wheels
> > q=moto honda
> > etc...
> >
> > Should we run an other core that only index search queries? or is there a
> > way to do this with same instance and same core?
> >
> > Thanks for your help
> >
> >
> >
>
>
>


Re:Re: How to speed up solr search speed

2010-07-26 Thread Dennis Gearon
Isn't it always one of these four? (from most likely to least likely, 
generally)

Memory
Disk Speed
WebServer and its code
CPU

Memory and Disk are related, as swapping occurs between them. As long as memory 
is high enough, it becomes:

Disk Speed
WebServer and its code
CPU

If the WebServer is configured to be as fast as possible, THEN the CPU comes 
into play.

So normally:

1/ Put enough memory in so it doesn't swap
2/ Buy the fastest damn disk/diskArrays/SolidState/HyperDrive RamDisk/RAIDed 
HyperDrive RamDisk that you can afford.
3/ Tune your webserver code.

1 GOOD *LAPTOP* with 8-16 gig of ram (with a 64-bit OS), and a single, external 
SATA HyperDrive 64Gig RamDrive is SCREAMING, way beyond most single server 
boxes you'll pay to get hosting on.


Dennis Gearon

Signature Warning

EARTH has a Right To Life,
  otherwise we all die.

Read 'Hot, Flat, and Crowded'
Laugh at http://www.yert.com/film.php


--- On Fri, 7/16/10, marship  wrote:

> From: marship 
> Subject: Re:Re: How to speed up solr search speed
> To: solr-user@lucene.apache.org
> Date: Friday, July 16, 2010, 11:26 AM
> Hi. Peter. 
> 
>  Thanks for replying.
> 
> 
> >Hi Scott!
> >
> >> I am aware these cores on same server are
> interfering with each other.
> >
> >Thats not good. Try to use only one core per CPU. With
> more per CPU you
> >won't have any benefits over the single-core version, I
> think.
> 
>  I only have 2 servers, each CPU with 8 cores. Each server
> has 6G memory. So I have 16 CPU core in total. But I have 70
> solr cores so I have to use them on my 2 servers. Based on
> my observation, even when the search is processing, the CPU
> usage is not high. The memory usage is not high too. Each
> solr(jetty) instance on consume 40M-60M memory. My server
> always have 2-3G memory availabe.
> >
> >> can solr use more memory to avoid disk operation
> conflicts?
> >
> >Yes, only the memory you have on the machine of course.
> Are you using
> >tomcat or jetty?
> >
> 
> I am using jetty.
> >> For my case, I don't think solr can work as fast
> as 100-200ms on average.
> >
> >We have indices with a lot entries not as large as
> yours, but in the
> >range of X Million. and have response times under
> 100ms.
> >What about testing only one core with 5-10 Mio docs? If
> the response
> >time isn't any better maybe you need a different field
> config or sth.
> >different is wrong?
> 
> For the moment, I really don't know. I tried to use java
> -server -jar start.jar to start jetty/solr. I saw when solr
> start, sometimes some core search for simple keyword like
> "design" will take 70s, of course some only take 0-15ms.
> From my aspect, I do believe it is the harddisk accessed by
> these cores deplays each other. So finally some cores fall
> behind. But the bad news for me is the solr distriubted
> search's speed is decided by the slowest one. 
> 
> 
> >
> >> So should I add it or the default(without it ) is
> ok?
> >
> >Without is also okay -> solr uses default.
> >With 75 Mio docs it should around 20 000 but I guess
> there is sth.
> >different wrong: maybe caching or field definition.
> Could you post the
> >latter one?
> >
> 
> Sorry. What are you asking me to post?
> 
>  
> 
> 
> >Regards,
> >Peter.
> >
> >> Hi. Peter.
> >> I think I am not using faceting, highlighting ...
> I read about them
> >> but don't know how to work with them. I am using
> the default "example"
> >> just change the indexed fields.
> >> For my case, I don't think solr can work as fast
> as 100-200ms on
> >> average. I tried some keywords on only single solr
> instance. It
> >> sometimes takes more than 20s. I just input 4
> keywords. I agree it is
> >> keyword concerns. But the issue is it doesn't work
> consistently.
> >>
> >> When 37 instances on same server works at same
> time (when a
> >> distributed search start), it goes worse, I saw
> some solr cores
> >> execute very fast, 0ms, ~40ms, ~200ms. But more
> solr cores executed as
> >> ~2500ms, ~3500ms, ~6700ms. and about 5-10 solr
> cores need more than
> >> 17s. I have 70 cores running. And the search speed
> depends on the
> >> SLOWEST one. Even 69 cores can run at 1ms. but
> last one need 50s. then
> >> the distributed search speed is 50s.
> >> I am aware these cores on same server are
> interfering with each other.
> >> As I have lots of free memory. I want to know,
> with the prerequisite,
> >> can solr use more memory to avoid disk operation
> conflicts?
> >>
> >> Thanks.
> >> Regards.
> >> Scott
> >>
> >>> On 2010-07-15 17:19:57, "Peter Karich" wrote:
> 写道:
> >>> How does your queries look like? Do you use
> faceting, highlighting, ... ?
> >>> Did you try to customize the cache?
> >>> Setting the HashDocSet to "0.005 of all
> documents" improves our
> >>> search speed a lot.
> >>> Did you optimize the index?
> >>>
> >>> 500ms seems to be slow for an 'average'
> search. I am not an expert
> >>> but without highlighting it should be faster
> as 100ms or at least 200ms
> >>>
> >>> Regards,

Re: how to Protect data

2010-07-26 Thread Dennis Gearon
If it's not the data that's being searched, you can always encode it before 
inserting it. You either have to further encode it to base64 to make it 
printable before storing it, OR use a binary field.

You probably could also set up an external process that cycles through every 
document in the index, encodes the fields in question and reinserts the 
document. The time and horse power to do that might be better spent 
regenerating the index from scratch with the newly encoded documents.

You might even be able to modify something in Solr/Lucene to do the encoding 
automatically using Java. Java must have encryption libraries like most other 
languages.
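A minimal sketch of the "encode before inserting" idea in Java, assuming
commons-codec (which Solr already ships) is on the classpath:

   import org.apache.commons.codec.binary.Base64;

   public class Obfuscate {
     public static void main(String[] args) throws Exception {
       String raw = "name, address, postal code";

       // encode before indexing and put the result in a stored-only field.
       // NOTE: base64 is only obfuscation; for real protection, encrypt first
       // (e.g. with javax.crypto) and base64-encode the ciphertext.
       String stored = new String(Base64.encodeBase64(raw.getBytes("UTF-8")), "US-ASCII");

       // decode on the way out
       String back = new String(Base64.decodeBase64(stored.getBytes("US-ASCII")), "UTF-8");
       System.out.println(back);
     }
   }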

I don't know solr/lucene well enough to say, but the data that's in the 
searchable columns must be visible as well, in some manner. I don't know how 
understandable it is after being tokenized. Someone else would have to comment 
on that.

Dennis Gearon

Signature Warning

EARTH has a Right To Life,
  otherwise we all die.

Read 'Hot, Flat, and Crowded'
Laugh at http://www.yert.com/film.php


--- On Sun, 7/25/10, Girish Pandit  wrote:

> From: Girish Pandit 
> Subject: how to Protect data
> To: solr-user@lucene.apache.org
> Date: Sunday, July 25, 2010, 5:12 PM
> Hi,
> 
> I was being ask about protecting data, means that the
> search index data is stored in the some indexed files and
> when you open those indexed files, I can clearly see the
> data, means some texts, e.g. name, address, postal code
> etc.
> 
> is there anyway I can hide the data? means some kind of
> data encoding to not even see any text raw data.
> 
> -Girish
> 
> 


Re: 2 type of docs in same schema?

2010-07-26 Thread scrapy

 Thanks for you answer! That's great.

Now, to index search query data, is there something special to do? Or does it 
stay as usual?

 


 

 

-Original Message-
From: Geert-Jan Brits 
To: solr-user@lucene.apache.org
Sent: Mon, Jul 26, 2010 4:57 pm
Subject: Re: 2 type of docs in same schema?


You can easily have different types of documents in 1 core:

1. define searchquery as a field(just as the others in your schema)
2. define type as a field (this allows you to decide which type of documents
to search for, e.g: "type_normal" or "type_search")

now searching on regular docs becomes:
q=title:some+title&fq=type:type_normal

and searching for searchqueries becomes (I think this is what you want):
q=searchquery:bmw+car&fq=type:type_search

Geert-Jan

2010/7/26 

>
>
>
>  I need you expertise on this one...
>
> We would like to index every search query that is passed in our solr engine
> (same core)
>
> Our docs format are like this (already in our schema):
> title
> content
> price
> category
> etc...
>
> Now how to add "search queries" as a field in our schema? Know that the
> search queries won't have all the field above?
> For example:
> q=bmw car
> q=car wheels
> q=moto honda
> etc...
>
> Should we run an other core that only index search queries? or is there a
> way to do this with same instance and same core?
>
> Thanks for your help
>
>
>

 


Updating fields in Solr

2010-07-26 Thread Pramod Goyal
Hi,
 I have a requirement where I need to keep updating certain fields in
the schema. My requirement is to change some of the fields or add some
values to a field (multi-valued field).  I understand that I can use Solr
update for this. If I am using Solr update, do I need to publish the entire
document again, or do I just need to publish the updated fields? Also, in
the case of an update, can I add (append) new values to the existing fields?

In my document most of the parts remain unchanged; however, a few fields
keep changing. Will it be costly to update the entire document just to
change a field? I was wondering if I should create two Solr cores, one for
static content and another one for dynamic content. This way I can reduce
the time taken to update a document, but it introduces the complexity of
querying different cores and combining the results on the client side.

 Is there a way to configure Solr so that the client can execute a single
query and Solr internally executes multiple queries across different cores
and returns a single result?


Re: Solr Doc Lucene Doc !?

2010-07-26 Thread stockii

i want to learn more about the technology.

is a solrDoc really created? or is it in the code only for a
better understanding of the lucene and solr border? 
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Doc-Lucene-Doc-tp995922p99.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: slave index is bigger than master index

2010-07-26 Thread Bastian Spitzer
as far as i know this is not needed; the optimized index is automatically 
replicated to the slaves. therefore something seems to be really wrong with 
your setup. maybe the slave index got corrupted for some reason? did you try 
deleting the data dir plus a slave restart for a fresh replicated index? 
maybe worth a try.. 

good luck

-Ursprüngliche Nachricht-
Von: Peter Karich [mailto:peat...@yahoo.de] 
Gesendet: Montag, 26. Juli 2010 16:54
An: solr-user@lucene.apache.org
Betreff: Re: slave index is bigger than master index

did you try an optimize on the slave too?

> Yes I always run an optimize whenever I index on master. In fact I 
> just ran an optimize command an hour ago, but it didn't make any difference.
>   




Re: slave index is bigger than master index

2010-07-26 Thread Muneeb Ali

No I didn't. I thought you aren't supposed to run optimize on slaves. Well,
it doesn't matter now, as I think it's fixed now. I just added a dummy
document on master, ran a commit call, and then once that executed, ran an
optimize call. This triggered snapshooter to replicate the index, which
somehow resulted in a normal index size at the slaves. 

I still don't get what exactly happened there, and will be investigating
into this. If I do find anything interesting, will update on this mailing
list.

Thanks for all your input anyways,

-Muneeb 
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/slave-index-is-bigger-than-master-index-tp996329p996611.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: 2 type of docs in same schema?

2010-07-26 Thread Geert-Jan Brits
You can easily have different types of documents in 1 core:

1. define searchquery as a field(just as the others in your schema)
2. define type as a field (this allows you to decide which type of documents
to search for, e.g: "type_normal" or "type_search")

now searching on regular docs becomes:
q=title:some+title&fq=type:type_normal

and searching for searchqueries becomes (I think this is what you want):
q=searchquery:bmw+car&fq=type:type_search
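A hedged sketch of the two schema.xml additions this implies (the field types
depend on your schema):

   <field name="type" type="string" indexed="true" stored="true" required="true"/>
   <field name="searchquery" type="text" indexed="true" stored="true"/>

A string type keeps the filter-query values exact, and required="true" guards
against forgetting the type at indexing time.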

Geert-Jan

2010/7/26 

>
>
>
>  I need you expertise on this one...
>
> We would like to index every search query that is passed in our solr engine
> (same core)
>
> Our docs format are like this (already in our schema):
> title
> content
> price
> category
> etc...
>
> Now how to add "search queries" as a field in our schema? Know that the
> search queries won't have all the field above?
> For example:
> q=bmw car
> q=car wheels
> q=moto honda
> etc...
>
> Should we run an other core that only index search queries? or is there a
> way to do this with same instance and same core?
>
> Thanks for your help
>
>
>


Re: slave index is bigger than master index

2010-07-26 Thread Peter Karich
did you try an optimize on the slave too?

> Yes I always run an optimize whenever I index on master. In fact I just ran
> an optimize command an hour ago, but it didn't make any difference.
>   




Re: DIH : SQL query (sub-entity) is executed although variable is not set (null or empty list)

2010-07-26 Thread MitchK

Hi Chantal,

Did you try to write a custom DIH function
(http://wiki.apache.org/solr/DIHCustomFunctions)?
If not, I think this would be a solution.
Just check whether "${prog.vip}" is an empty string or null.
If so, you need to replace it with a value that can never match anything.

So the vip-field will always be empty for such queries. 
Maybe that helps?

Hopefully, the variable resolver is able to resolve something like
${dih.functions.getReplacementIfNeeded(prog.vip)}.
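A hedged sketch of such a function against the Solr 1.4 DIH API (the class
name, function name, and sentinel value are all made up; the argument handling
is deliberately naive):

   import org.apache.solr.handler.dataimport.Context;
   import org.apache.solr.handler.dataimport.Evaluator;

   public class NullSafeEvaluator extends Evaluator {
     @Override
     public String evaluate(String expression, Context context) {
       String name = expression.trim();
       if (name.startsWith("'") && name.endsWith("'"))   // strip quoting
         name = name.substring(1, name.length() - 1);
       Object v = context.getVariableResolver().resolve(name);
       String s = (v == null) ? "" : v.toString();
       // return a value no row can match, so the in-clause stays valid SQL
       return (s.length() == 0) ? "'__NO_MATCH__'" : s;
     }
   }

It would be registered in data-config.xml as
<function name="nullSafe" class="com.example.NullSafeEvaluator"/>
and used as ${dih.functions.nullSafe('prog.vip')}.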

Kind regards,
- Mitch



Chantal Ackermann wrote:
> 
> Hi,
> 
> my use case is the following:
> 
> In a sub-entity I request rows from a database for an input list of
> strings:
> 
>/* multivalued, not required */
>  query="select SSC_VALUE from SSC_VALUE
>   where SSC_ATTRIBUTE_ID=1
> and SSC_VALUE in (${prog.vip})">
>   
>   
> 
> 
> The root entity is "prog" and it has an optional multivalued field
> called "vip". When the list of "vip" values is empty, the SQL for the
> sub-entity above throws an SQLException. (Working with Oracle which does
> not allow an empty expression in the "in"-clause.)
> 
> Two things:
> (A) best would be not to run the query whenever ${prog.vip} is null or
> empty.
> (B) From the documentation, it is not clear that onError is only checked
> in the transformer runs but not checked when the SQL for the entity
> throws an exception. (Trunk version JdbcDataSource lines 250pp).
> 
> IMHO, (A) is the better fix, and if so, (B) is the right decision. (If
> (A) is not easily fixable, making (B) work would be helpful.)
> 
> Looking through the code, I've realized that the replacement of the
> variables is done in a very generic way. I've not yet seen an
> appropriate way to check on those variables in order to stop the
> processing of the entity if the variable is empty.
> Is there a way to do this? Or maybe there is a completely different way
> to get my use case working. Any help most appreciated!
> 
> Thanks,
> Chantal
> 
> 
> 
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/DIH-SQL-query-sub-entity-is-executed-although-variable-is-not-set-null-or-empty-list-tp995983p996446.html
Sent from the Solr - User mailing list archive at Nabble.com.


2 type of docs in same schema?

2010-07-26 Thread scrapy

 

 I need you expertise on this one...

We would like to index every search query that is passed in our solr engine 
(same core)

Our docs format are like this (already in our schema):
title
content
price
category
etc...

Now how to add "search queries" as a field in our schema? Know that the search 
queries won't have all the field above?
For example: 
q=bmw car
q=car wheels
q=moto honda
etc...

Should we run an other core that only index search queries? or is there a way 
to do this with same instance and same core?

Thanks for your help




Re: Solr Doc Lucene Doc !?

2010-07-26 Thread kenf_nc

DataImportHandler (DIH) is an add-on to Solr. It lets you import documents
from a number of sources in a flexible way. The only connection DIH has to
Lucene is that Solr uses Lucene as the index engine.

When you work with Solr you naturally talk about Solr Documents, if you were
working with Lucene natively (without Solr) you would talk about Lucene
documents, but they are basically the same thing. 

Are you having a specific issue? Or are you just trying to learn more about
the technology?

If you are mostly trying to understand DIH, then you should think in terms
of Solr and Solr documents. Understand that Lucene is working behind the
scenes, but you really don't need to worry about that all that often.
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Doc-Lucene-Doc-tp995922p996425.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Problem with parsing date

2010-07-26 Thread Rafal Bluszcz Zawadzki
I have just fixed it.

The problem was related to operating system values: they were different from
what solr expected for the incoming datastream.
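That matches what a plain-Java reproduction suggests (a hedged sketch; if I
read the 1.4 DateFormatTransformer right, it parses with the JVM default
locale, and "Wed"/"Jul" only parse under an English one):

   import java.text.SimpleDateFormat;
   import java.util.Locale;

   public class DateCheck {
     public static void main(String[] args) throws Exception {
       String s = "Wed, 15 Jul 2009 08:23:34 GMT";
       // throws ParseException when the default locale isn't English:
       // new SimpleDateFormat("EEE, d MMM yyyy HH:mm:ss z").parse(s);
       SimpleDateFormat f =
           new SimpleDateFormat("EEE, d MMM yyyy HH:mm:ss z", Locale.ENGLISH);
       System.out.println(f.parse(s));
     }
   }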

Regards,

Rafal Zawadzki

On Mon, Jul 26, 2010 at 3:20 PM, Chantal Ackermann <
chantal.ackerm...@btelligent.de> wrote:

> On Mon, 2010-07-26 at 14:46 +0200, Rafal Bluszcz Zawadzki wrote:
> > EEE, d MMM  HH:mm:ss z
>
> not sure but you might want to try with an uppercase 'Z' for the
> timezone (surrounded by single quotes, alternatively). The rest of your
> pattern looks fine. But if you still run into problems try different
> versions, like putting the comma in quotes etc.
>
> Cheers,
> Chantal
>
>
>


Re: slave index is bigger than master index

2010-07-26 Thread Muneeb Ali

I just checked my config file, and I do have exact same values for
deletionPolicy tag, as you attached in your email, so I dont really think it
could be this. 
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/slave-index-is-bigger-than-master-index-tp996329p996373.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: AW: slave index is bigger than master index

2010-07-26 Thread Muneeb Ali

Yes I always run an optimize whenever I index on master. In fact I just ran
an optimize command an hour ago, but it didn't make any difference.
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/slave-index-is-bigger-than-master-index-tp996329p996364.html
Sent from the Solr - User mailing list archive at Nabble.com.


spell checking....

2010-07-26 Thread satya swaroop
hi all,
i am new to solr and was able to implement indexing of documents
by following the solr wiki. now i am trying to add the spellchecking. i
followed the spellcheck component in the wiki but am not getting the suggested
spellings. i first built it by spellcheck.build=true,...

here i give u the example:::

http://localhost:8080/solr/spell?q=javs&spellcheck=true&spellcheck.collate=true



[the XML response was stripped when archived; it contained an empty
spellcheck suggestions list]

here the response should actually suggest "java" but didn't..

can anyone guide me about it...
 i am using solr 1.4, tomcat in ubuntu





Regards,
swarup


AW: slave index is bigger than master index

2010-07-26 Thread Bastian Spitzer
Hi,

Are you calling <optimize/> on the master to finally remove deleted documents 
and merge the index files?
Running it once a day is recommended:

http://wiki.apache.org/solr/SolrPerformanceFactors#Optimization_Considerations


cheers

-Ursprüngliche Nachricht-
Von: Muneeb Ali [mailto:muneeba...@hotmail.com] 
Gesendet: Montag, 26. Juli 2010 15:37
An: solr-user@lucene.apache.org
Betreff: slave index is bigger than master index


Hi,

I am using Solr version 1.4, with a master-slave setup. We have one master 
and two slave servers. It was all working fine, but lately the solr slaves are 
behaving strangely. Particularly while replicating the index, the slave nodes 
die and always need a restart. Also, the index size of the slave nodes is much 
bigger (336GB) than the master node index (i.e. only 86GB).

I am guessing that its not removing previous indices at slave nodes when 
replicating? Has anyone faced similar issues? 

Any help would be highly appreciated.

Thanks very much.

-Muneeb
--
View this message in context: 
http://lucene.472066.n3.nabble.com/slave-index-is-bigger-than-master-index-tp996329p996329.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: slave index is bigger than master index

2010-07-26 Thread Tommaso Teofili
Hi,
I think that you may be using a Lucene/Solr IndexDeletionPolicy that does
not remove old commits (and you aren't propagating solr-config via
replication).
You can configure this feature in solrconfig.xml inside the
<mainIndex> tag:

<deletionPolicy class="solr.SolrDeletionPolicy">
  <!-- the number of commit points to be kept -->
  <str name="maxCommitsToKeep">1</str>
  <!-- the number of optimized commit points to be kept -->
  <str name="maxOptimizedCommitsToKeep">0</str>
</deletionPolicy>

I hope this can be helpful.
Cheers,
Tommaso

2010/7/26 Muneeb Ali 

>
> Hi,
>
> I am using Solr version 1.4, with a master-slave setup. We have one master
> and two slave servers. It was all working fine, but lately the solr slaves
> are behaving strangely. Particularly while replicating the index, the slave
> nodes die and always need a restart. Also the index size of slave nodes is
> much bigger (336GB) than the master node index (i.e. only 86GB).
>
> I am guessing that its not removing previous indices at slave nodes when
> replicating? Has anyone faced similar issues?
>
> Any help would be highly appreciated.
>
> Thanks very much.
>
> -Muneeb
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/slave-index-is-bigger-than-master-index-tp996329p996329.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


slave index is bigger than master index

2010-07-26 Thread Muneeb Ali

Hi,

I am using Solr version 1.4, with a master-slave setup. We have one master
and two slave servers. It was all working fine, but lately the solr slaves
are behaving strangely. Particularly while replicating the index, the slave
nodes die and always need a restart. Also, the index size of the slave nodes is
much bigger (336GB) than the master node index (i.e. only 86GB).

I am guessing that its not removing previous indices at slave nodes when
replicating? Has anyone faced similar issues? 

Any help would be highly appreciated.

Thanks very much.

-Muneeb
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/slave-index-is-bigger-than-master-index-tp996329p996329.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Problem with parsing date

2010-07-26 Thread Chantal Ackermann
On Mon, 2010-07-26 at 14:46 +0200, Rafal Bluszcz Zawadzki wrote:
> EEE, d MMM yyyy HH:mm:ss z

Not sure, but you might want to try an uppercase 'Z' for the
timezone (or, alternatively, surround it with single quotes). The rest of your
pattern looks fine. But if you still run into problems, try different
versions, like putting the comma in quotes etc.
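
(A quick way to test the pattern outside DIH - note that SimpleDateFormat uses the JVM
default locale unless you pass one, so the English day and month names fail to parse on
e.g. a German JVM; as far as I can see, the 1.4 DateFormatTransformer does not let you
override the locale:)

import java.text.SimpleDateFormat;
import java.util.Locale;

public class DateCheck {
    public static void main(String[] args) throws Exception {
        String input = "Wed, 15 Jul 2009 08:23:34 GMT";
        // parses fine with an English locale; a non-English default locale
        // throws a ParseException on "Wed"/"Jul"
        SimpleDateFormat sdf =
            new SimpleDateFormat("EEE, d MMM yyyy HH:mm:ss z", Locale.ENGLISH);
        System.out.println(sdf.parse(input));
    }
}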

Cheers,
Chantal




Similar search regarding a result document

2010-07-26 Thread scrapy

 Hi,

I would like to implement a similar-search feature... but not relative to the 
initial search query - relative to each result document.

The structure of each doc is:
id
title
content
price
etc...

Then we have a database of global search queries, and I'm thinking of 
integrating this into Solr.

I'm planning to implement this as a query of a query... but before that I would like 
to know if there is a built-in function in Solr for this?
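
(For what it's worth, Solr's MoreLikeThis component does something along these lines per
result document; a sketch of a query, with the field names taken from the structure above
and the id made up:)

http://localhost:8983/solr/select?q=id:123&mlt=true&mlt.fl=title,content&mlt.count=5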

Thanks for your help.



Re: 2 solr dataImport requests on a single core at the same time

2010-07-26 Thread kishan

By the way, I want to put all the requestHandlers (more than one) into one XML file and
include that in my solrconfig.xml.

I have used XInclude, but it didn't work.

Please suggest anything.
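
(For reference, standard XInclude syntax looks like the sketch below; the file name is
made up, whether Solr resolves XIncludes depends on the version and XML parser, and the
included file must itself be well-formed XML with a single root element - a bare list of
several requestHandler elements in one file will not load:)

<config xmlns:xi="http://www.w3.org/2001/XInclude">
  ...
  <xi:include href="my-handlers.xml"/>
</config>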


Thanks,
Prasad
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/2-solr-dataImport-requests-on-a-single-core-at-the-same-time-tp978649p996248.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: 2 solr dataImport requests on a single core at the same time

2010-07-26 Thread kishan

Thank you very much.


-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/2-solr-dataImport-requests-on-a-single-core-at-the-same-time-tp978649p996190.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Problem with parsing date

2010-07-26 Thread Rafal Bluszcz Zawadzki
I am also using other dateFormat strings in the same data handler, and they
work. But not this one.

And these data are fetched from an external source, so I don't have the
possibility to modify them (well, theoretically I could save them, edit them, etc., but
that is not the way). Why is this not working with Solr?



On Mon, Jul 26, 2010 at 2:37 PM, Li Li  wrote:

> I use a format like yyyy-MM-ddThh:mm:ssZ. It works.
>
> 2010/7/26 Rafal Bluszcz Zawadzki :
> > Hi,
> >
> > I am using Data Import Handler from Solr 1.4.
> >
> > Parts of my data-config.xml are:
> >
> > <entity name="..."
> >         processor="XPathEntityProcessor"
> >         stream="false"
> >         forEach="/multistatus/response"
> >         url="/tmp/file.xml"
> >         transformer="RegexTransformer,DateFormatTransformer,TemplateTransformer"
> >         >
> > .
> >
> > <field column="modified"
> >        xpath="/multistatus/response/propstat/prop/getlastmodified"
> >        dateTimeFormat="EEE, d MMM yyyy HH:mm:ss z" />
> > <field column="created"
> >        xpath="/multistatus/response/propstat/prop/creationdate"
> >        dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss'Z'"/>
> >
> > During full-import I got message:
> >
> > WARNING: Error creating document :
> > SolrInputDocument[{SearchableText=SearchableText(1.0)={phrase},
> > parentPaths=parentPaths(1.0)={/site},
> > review_state=review_state(1.0)={published}, created=created(1.0)={Sat Oct 11
> > 14:38:27 CEST 2003}, UID=UID(1.0)={http://www.example.com:80/File-1563},
> > Title=Title(1.0)={This is only an example document},
> > portal_type=portal_type(1.0)={Document}, modified=modified(1.0)={Wed, 15 Jul
> > 2009 08:23:34 GMT}}]
> > org.apache.solr.common.SolrException: Invalid Date String:'Wed, 15 Jul 2009
> > 08:23:34 GMT'
> > at org.apache.solr.schema.DateField.parseMath(DateField.java:163)
> > at org.apache.solr.schema.TrieDateField.createField(TrieDateField.java:171)
> > at org.apache.solr.schema.SchemaField.createField(SchemaField.java:94)
> > at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:246)
> > at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:60)
> >
> > Which, as I understand, means that Solr / Java couldn't parse my date.
> >
> > In my xml file it looks like:
> > Wed, 15 Jul 2009 08:23:34 GMT
> >
> > In my opinion the format "EEE, d MMM yyyy HH:mm:ss z" is correct, and what is
> > more important - it was supposed to work with the same data a week ago :)
> >
> > Any idea will be appreciated.
> >
> > --
> > Rafal Zawadzki
> > Backend developer
> >
>


Re: Problem with Pdf, Solr 1.4.1 Cell

2010-07-26 Thread Tommaso Teofili
Hi,
I think there is an open bug for it at:
https://issues.apache.org/jira/browse/SOLR-1902
Using Solr 1.4.1 and upgrading the Tika libraries to a 0.8 snapshot, I also had to
upgrade pdfbox, fontbox and jempbox to 1.2.1; I got no errors and it seems
it's able to index PDFs (I can query them by id:doc1, for
example), but it did not extract text or other metadata from them.
Building a new Solr distribution from trunk (ant dist) and using a Tika 0.8
snapshot (with pdfbox, fontbox and jempbox 1.2.1), it seems to be working.
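
Roughly, the swap was (a sketch from memory; exact jar names vary by build):
1) remove the Tika 0.4 jars and the matching pdfbox/fontbox/jempbox jars from
   contrib/extraction/lib
2) drop in tika-core and tika-parsers 0.8-SNAPSHOT plus pdfbox, fontbox and
   jempbox 1.2.1
3) restart Solr and re-index the PDFs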
My 2 cents,
Tommaso

2010/7/23 Alessandro Benedetti 

> Hi all,
> As I saw in this discussion [1], there were many issues with PDF indexing in
> Solr 1.4 due to the Tika library (version 0.4).
> In Solr 1.4.1 the Tika library is the same, so I guess the issues are the
> same.
> Could anyone who contributed to the previous thread help me in resolving
> these issues?
> I need a simple tutorial that could help me to upgrade Solr Cell!
>
> Something like this:
> 1) download Tika core from trunk
> 2) create a jar with the Maven dependencies
> 3) unjar Solr 1.4.1 and change the Tika library
> 4) jar the patched Solr 1.4.1 and enjoy!
>
> [1]
>
> http://markmail.org/message/zbkplnzqho7mxwy3#query:+page:1+mid:gamcxdx34ayt6ccg+state:results
>
> Best regards
>
> --
> --
>
> Benedetti Alessandro
> Personal Page: http://tigerbolt.altervista.org
>
> "Tyger, tyger burning bright
> In the forests of the night,
> What immortal hand or eye
> Could frame thy fearful symmetry?"
>
> William Blake - Songs of Experience -1794 England
>


Re: Problem with parsing date

2010-07-26 Thread Li Li
I use a format like yyyy-MM-ddThh:mm:ssZ. It works.

2010/7/26 Rafal Bluszcz Zawadzki :
> Hi,
>
> I am using Data Import Handler from Solr 1.4.
>
> Parts of my data-config.xml are:
>
> <entity name="..."
>         processor="XPathEntityProcessor"
>         stream="false"
>         forEach="/multistatus/response"
>         url="/tmp/file.xml"
>         transformer="RegexTransformer,DateFormatTransformer,TemplateTransformer"
>         >
> .
>
> <field column="modified"
>        xpath="/multistatus/response/propstat/prop/getlastmodified"
>        dateTimeFormat="EEE, d MMM yyyy HH:mm:ss z" />
> <field column="created"
>        xpath="/multistatus/response/propstat/prop/creationdate"
>        dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss'Z'"/>
>
> During full-import I got message:
>
> WARNING: Error creating document :
> SolrInputDocument[{SearchableText=SearchableText(1.0)={phrase},
> parentPaths=parentPaths(1.0)={/site},
> review_state=review_state(1.0)={published}, created=created(1.0)={Sat Oct 11
> 14:38:27 CEST 2003}, UID=UID(1.0)={http://www.example.com:80/File-1563},
> Title=Title(1.0)={This is only an example document},
> portal_type=portal_type(1.0)={Document}, modified=modified(1.0)={Wed, 15 Jul
> 2009 08:23:34 GMT}}]
> org.apache.solr.common.SolrException: Invalid Date String:'Wed, 15 Jul 2009
> 08:23:34 GMT'
> at org.apache.solr.schema.DateField.parseMath(DateField.java:163)
> at org.apache.solr.schema.TrieDateField.createField(TrieDateField.java:171)
> at org.apache.solr.schema.SchemaField.createField(SchemaField.java:94)
> at
> org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:246)
> at
> org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:60)
>
> Which, as I understand, means that Solr / Java couldn't parse my date.
>
> In my xml file it looks like:
> Wed, 15 Jul 2009 08:23:34 GMT
>
> In my opinion the format "EEE, d MMM yyyy HH:mm:ss z" is correct, and what is more
> important - it was supposed to work with the same data a week ago :)
>
> Any idea will be appreciated.
>
> --
> Rafal Zawadzki
> Backend developer
>


Problem with parsing date

2010-07-26 Thread Rafal Bluszcz Zawadzki
Hi,

I am using Data Import Handler from Solr 1.4.

Parts of my data-config.xml are:

<entity name="..."
        processor="XPathEntityProcessor"
        stream="false"
        forEach="/multistatus/response"
        url="/tmp/file.xml"
        transformer="RegexTransformer,DateFormatTransformer,TemplateTransformer"
        >
.

<field column="modified"
       xpath="/multistatus/response/propstat/prop/getlastmodified"
       dateTimeFormat="EEE, d MMM yyyy HH:mm:ss z" />
<field column="created"
       xpath="/multistatus/response/propstat/prop/creationdate"
       dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss'Z'"/>

During full-import I got message:

WARNING: Error creating document :
SolrInputDocument[{SearchableText=SearchableText(1.0)={phrase},
parentPaths=parentPaths(1.0)={/site},
review_state=review_state(1.0)={published}, created=created(1.0)={Sat Oct 11
14:38:27 CEST 2003}, UID=UID(1.0)={http://www.example.com:80/File-1563},
Title=Title(1.0)={This is only an example document},
portal_type=portal_type(1.0)={Document}, modified=modified(1.0)={Wed, 15 Jul
2009 08:23:34 GMT}}]
org.apache.solr.common.SolrException: Invalid Date String:'Wed, 15 Jul 2009
08:23:34 GMT'
at org.apache.solr.schema.DateField.parseMath(DateField.java:163)
at org.apache.solr.schema.TrieDateField.createField(TrieDateField.java:171)
at org.apache.solr.schema.SchemaField.createField(SchemaField.java:94)
at
org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:246)
at
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:60)

Which, as I understand, means that Solr / Java couldn't parse my date.

In my xml file it looks like:
Wed, 15 Jul 2009 08:23:34 GMT

In my opinion the format "EEE, d MMM yyyy HH:mm:ss z" is correct, and what is more
important - it was supposed to work with the same data a week ago :)

Any idea will be appreciated.

-- 
Rafal Zawadzki
Backend developer


Can't find org.apache.solr.client.solrj.embedded

2010-07-26 Thread Uwe Reh

Hello experts,

Where is a jar containing org.apache.solr.client.solrj.embedded?

This package is missing from 'apache-solr-solrj-1.4.[01].jar'.
Also, I can't find any sources other than
http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/src/webapp/src/org/apache/solr/client/solrj/embedded/ , which do not fit Solr 1.4.


Any tips for a blind newbie?

Uwe




Re: Solr Doc Lucene Doc !?

2010-07-26 Thread stockii

... but the code talks about SolrDocuments. These are higher-level
docs, used to construct the Lucene doc to index... !?

And the wiki talks about "Build Solr documents by aggregating data from
multiple columns and tables according to configuration":
http://wiki.apache.org/solr/DataImportHandler?highlight=(dih)

So it's a little bit confusing.
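
(For what it's worth, a minimal sketch of the hand-over, with class and method names as
they appear in Solr 1.4 stack traces; 'core' is an assumed SolrCore reference and the
field names are made up:)

import org.apache.lucene.document.Document;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.update.DocumentBuilder;

// every update path, DIH included, first builds a Solr-level input document ...
SolrInputDocument sdoc = new SolrInputDocument();
sdoc.addField("id", "1");
sdoc.addField("title", "hello");
// ... which Solr converts into a Lucene Document against the schema before indexing:
Document ldoc = DocumentBuilder.toDocument(sdoc, core.getSchema());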
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Doc-Lucene-Doc-tp995922p996005.html
Sent from the Solr - User mailing list archive at Nabble.com.


DIH : SQL query (sub-entity) is executed although variable is not set (null or empty list)

2010-07-26 Thread Chantal Ackermann
Hi,

my use case is the following:

In a sub-entity I request rows from a database for an input list of
strings:

/* ${prog.vip} is multivalued, not required */
<entity name="..."
        query="select ... from ... where ... in (${prog.vip})">
    ...
</entity>

The root entity is "prog" and it has an optional multivalued field
called "vip". When the list of "vip" values is empty, the SQL for the
sub-entity above throws an SQLException. (Working with Oracle which does
not allow an empty expression in the "in"-clause.)
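
(Concretely, with made-up table and column names, the interpolated SQL degenerates to

  select vip_name from vip_table where vip_name in ()

which Oracle rejects with ORA-00936 "missing expression".)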

Two things:
(A) best would be not to run the query whenever ${prog.vip} is null or
empty.
(B) From the documentation it is not clear that onError is only checked
during the transformer runs, but not when the SQL for the entity
throws an exception. (Trunk version of JdbcDataSource, lines 250pp.)

IMHO, (A) is the better fix, and if so, (B) is the right decision. (If
(A) is not easily fixable, making (B) work would be helpful.)

Looking through the code, I've realized that the replacement of the
variables is done in a very generic way. I've not yet seen an
appropriate way to check on those variables in order to stop the
processing of the entity if the variable is empty.
Is there a way to do this? Or maybe there is a completely different way
to get my use case working. Any help most appreciated!

Thanks,
Chantal



Re: Solr Doc Lucene Doc !?

2010-07-26 Thread MitchK

Stockii,

Solr's index is a Lucene Index. Therefore, Solr documents are Lucene
documents.

Kind regards,
- Mitch
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Doc-Lucene-Doc-tp995922p995968.html
Sent from the Solr - User mailing list archive at Nabble.com.


Solr Doc Lucene Doc !?

2010-07-26 Thread stockii

Hello. 

I am writing a little text about Solr and Lucene and their use via the DIH.

What documents does the DIH create and insert? The wiki talks about
"Solr documents", but I thought that Solr uses Lucene for this, so that
the DIH creates Lucene documents, not Solr documents!?

What does the DIH do, exactly?

How can I easily find that out?

Or how are documents managed internally in Solr? Is there a difference between
Solr and Lucene docs?


Can anyone give me a little overview of how the DIH works? That would be great ;-)

thx stockiii
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Doc-Lucene-Doc-tp995922p995922.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: schema.xml

2010-07-26 Thread Grijesh.singh

Hi,
There are no required fields unless you mark a field as required. You can
remove or add as many fields as you want.

That is an example schema which shows how fields are configured.
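
(A field is made mandatory with the required attribute in schema.xml; a one-line sketch,
field name illustrative:)

<field name="id" type="string" indexed="true" stored="true" required="true"/>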
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/schema-xml-tp995696p995800.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: how to Protect data

2010-07-26 Thread Peter Karich
Hi Girish,

I am not aware of such a thing.
But you could use a middleware to keep certain fields from being
retrieved, via the 'fl' parameter:
http://wiki.apache.org/solr/CommonQueryParameters#fl

E.g. for your customers the query looks like q=hello&fl=title and for
your admin the query looks like q=hello&fl=title,securedField
( Instead of a fullblown middleware you could try
http://wiki.apache.org/solr/VelocityResponseWriter )

Another option is to store the data encrypted in a field which can be
retrieved, AND additionally store it in cleartext in a second field which is
only searchable but will not be returned.

Regards,
Peter.

> Hi,
>
> I was asked about protecting data: the search index
> data is stored in index files, and when you open those
> index files you can clearly see the data - some text, e.g.
> names, addresses, postal codes etc.
>
> Is there any way I can hide the data? Some kind of data encoding,
> so that no raw text data is visible at all.
>
> -Girish


Re: help with a schema design problem

2010-07-26 Thread Chantal Ackermann
Hi,

I haven't read everything thoroughly but have you considered creating
fields for each of your (I think what you call) "party value"?

So that you can query like "client:Pramod".
You would then be able to facet on client and supplier.
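
(A sketch of that in schema.xml - the field names are illustrative:)

<field name="client"   type="string" indexed="true" stored="true" multiValued="true"/>
<field name="supplier" type="string" indexed="true" stored="true" multiValued="true"/>

Document 1 then carries client=Pramod and supplier=Raj, so client:Pramod matches it
while supplier:Pramod does not.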

Cheers,
Chantal



On Fri, 2010-07-23 at 23:23 +0200, Geert-Jan Brits wrote:
> Multiple rows in the OP's example are combined to form one Solr document (e.g.
> rows 1 and 2 both have documentid=1).
> Because of this combining, it would match p_value from row 1 with p_type from
> row 2 (or vice versa).
> 
> 
> 2010/7/23 Nagelberg, Kallin 
> 
> > > > > When I search
> > > > > p_value:"Pramod" AND p_type:"Supplier"
> > > > >
> > > > > it would give me result as document 1, which is incorrect, since in
> > > > > document 1 Pramod is a Client and not a Supplier.
> >
> > Would it? I would expect it to give you nothing.
> >
> > -Kal
> >
> >
> >
> > -Original Message-
> > From: Geert-Jan Brits [mailto:gbr...@gmail.com]
> > Sent: Friday, July 23, 2010 5:05 PM
> > To: solr-user@lucene.apache.org
> > Subject: Re: help with a schema design problem
> >
> > > Is there any way in solr to say p_value[someIndex]="pramod"
> > > And p_type[someIndex]="client".
> > No, I'm 99% sure there is not.
> >
> > > One way would be to define a single field in the schema as p_value_type =
> > > "client pramod", i.e. combine the values from both fields and store them in
> > > a single field.
> > Yep, for the use case you mentioned that would definitely work - multivalued,
> > of course, so it can contain "Supplier Raj" as well.
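
(A sketch of that combined field in schema.xml, using the example "text" field type,
whose positionIncrementGap="100" keeps a phrase query from matching across two different
values of a multivalued field:)

<field name="p_value_type" type="text" indexed="true" stored="true" multiValued="true"/>

Indexed as p_value_type = ["Client Pramod", "Supplier Raj"], the phrase query
p_value_type:"Client Pramod" then matches document 1, while
p_value_type:"Supplier Pramod" does not.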
> >
> >
> > 2010/7/23 Pramod Goyal 
> >
> > >    In my case the document id is the unique key (each row is not a unique
> > > document). So a single document has multiple Party Values and Party Types.
> > > Hence I need to define both Party Value and Party Type as multi-valued. Is
> > > there any way in solr to say p_value[someIndex]="pramod" and
> > > p_type[someIndex]="client"?
> > >    Is there any other way I can design my schema? I have some solutions,
> > > but none seems to be a good solution. One way would be to define a single
> > > field in the schema as p_value_type = "client pramod", i.e. combine the
> > > values from both fields and store them in a single field.
> > >
> > >
> > > On Sat, Jul 24, 2010 at 12:18 AM, Geert-Jan Brits 
> > > wrote:
> > >
> > > > With the use case you specified, it should work to just index each "Row",
> > > > as you described in your initial post, as a separate document.
> > > > This way p_value and p_type all become single-valued and you get a correct
> > > > combination of p_value and p_type.
> > > >
> > > > However, this may not go so well with other use-cases you have in mind,
> > > > e.g. requiring that no multiple results are returned with the same
> > > > document id.
> > > >
> > > >
> > > >
> > > > 2010/7/23 Pramod Goyal 
> > > >
> > > > > I want to do that. But if I understand correctly, in Solr it would
> > > > > store the field like this:
> > > > >
> > > > > p_value: "Pramod"  "Raj"
> > > > > p_type:  "Client" "Supplier"
> > > > >
> > > > > When I search
> > > > > p_value:"Pramod" AND p_type:"Supplier"
> > > > >
> > > > > it would give me result as document 1, which is incorrect, since in
> > > > > document 1 Pramod is a Client and not a Supplier.
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Fri, Jul 23, 2010 at 11:52 PM, Nagelberg, Kallin <
> > > > > knagelb...@globeandmail.com> wrote:
> > > > >
> > > > > > I think you just want something like:
> > > > > >
> > > > > > p_value:"Pramod" AND p_type:"Supplier"
> > > > > >
> > > > > > no?
> > > > > > -Kallin Nagelberg
> > > > > >
> > > > > > -Original Message-
> > > > > > From: Pramod Goyal [mailto:pramod.go...@gmail.com]
> > > > > > Sent: Friday, July 23, 2010 2:17 PM
> > > > > > To: solr-user@lucene.apache.org
> > > > > > Subject: help with a schema design problem
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > Let's say I have a table with 3 columns: document id, Party Value and
> > > > > > Party Type.
> > > > > > In this table I have 3 rows. 1st row: Document id: 1, Party Value:
> > > > > > Pramod, Party Type: Client. 2nd row: Document id: 1, Party Value: Raj,
> > > > > > Party Type: Supplier. 3rd row: Document id: 2, Party Value: Pramod,
> > > > > > Party Type: Supplier.
> > > > > > Now in this table, if I use SQL it's easy for me to find all documents
> > > > > > with Party Value as Pramod and Party Type as Client.
> > > > > >
> > > > > > I need to design a Solr schema so that I can do the same in Solr. If I
> > > > > > create 2 fields in the Solr schema, Party Value and Party Type, both of
> > > > > > them multivalued, and try to query +Pramod +Supplier, then Solr will
> > > > > > return me the first document, even though in the first document Pramod
> > > > > > is a client and not a supplier.
> > > > > > Thanks,
> > > > > > Pramod Goyal
> > > > > >
> > > > >
> > 

Integration Problem

2010-07-26 Thread Jörg Wißmeier
Hi everybody,

for a while now I have been working with Solr, and I have integrated it with
Liferay 6.0.3, so every search request from Liferay is processed by Solr
and its index.
But I have to integrate another system, which offers me a webservice. The
results of this webservice should appear in Solr's results, but not in its
index.
I tried to do that with a custom query handler and a custom response
writer, and I am able to write into Solr's response message, but only into the
response node of the XML message and not into the results node.
So is there any solution for how I could write into the results node of the
XML message from Solr?

thanks in advance

best regards
joerg