Re: solr result....

2010-10-28 Thread satya swaroop
Hi Lance,
  I actually copied the Tika exceptions into one HTML file and indexed
it. It is just the content of a file, and here I will explain what I mean:


If I post a query like *java*, then the result or response from Solr should
hit only a part of the content, as follows:

http://localhost:8456/solr/select/?q=java&version=2.2&start=10&rows=10&indent=on

0
453

application/pdf

javaebuk
2001-07-02T11:54:10Z

A Java program with two main methods  The following is an example of a java
program with two main methods with different signatures.
Program 3
public class TwoMains
{
/** This class has two main methods with
* different signatures */
public static void main (String args[]) ...






The doc in the result should not contain the entire content of a file. It
should have only a part of the content. The content should be the first hit
of the word "java" in that file...
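What is described above is essentially what Solr's highlighting component returns; here is a hedged sketch of such a request, assuming the indexed body field is named "content" (the field name and fragment size are illustrative, not taken from Satya's schema):

```
http://localhost:8456/solr/select/?q=java&start=10&rows=10&indent=on
    &fl=id,score
    &hl=on
    &hl.fl=content
    &hl.snippets=1
    &hl.fragsize=100
```

With these parameters the response carries a separate highlighting section containing only the matching fragment(s), rather than the whole stored document.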


Regards,
satya


Re: If I want to move a core from one physical machine to another....

2010-10-28 Thread Gora Mohanty
On Thu, Oct 28, 2010 at 3:42 AM, Ron Mayer  wrote:
> If I want to move a core from one physical machine to another,
> is it as simple as just
>   scp -r core5 otherserver:/path/on/other/server/
> and then adding
>    
> on that other server's solr.xml file and restarting the server there?

If "core5" is the local Solr data directory for that core,
yes that should work. Of course, you will also have to ensure
that the schema, and other configuration details, e.g., in
solrconfig.xml, match.
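For reference, the line stripped out of Ron's mail would be a <core> entry; a sketch of what the addition to the destination server's solr.xml might look like in a Solr 1.4 multicore layout (names taken from the example, attributes assumed):

```
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <!-- existing cores ... -->
    <core name="core5" instanceDir="core5" />
  </cores>
</solr>
```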

> PS: Should have I been able to figure the answer to that
>    out by RTFM somewhere?

Maybe not specifically noted, but copying Solr's index by
copying the data directory to a new machine was always
possible, as far as I know.

Regards,
Gora


Re: Searching with wrong keyboard layout or using translit

2010-10-28 Thread Alexander Kanarsky
Pavel,

I think there is no single way to implement this. Some ideas that
might be helpful:

1. Consider adding additional terms while indexing. This assumes
converting the Russian text to both "translit" and "wrong keyboard"
forms and indexing the converted terms along with the original terms (i.e. your
Analyzer/Filter should produce Moskva and Vjcrdf for the term Москва). You
may re-use the same field (if you plan on simple term queries) or
create separate fields for the generated terms (better for phrase,
proximity queries etc. since it keeps the original text's positional
info). Then the query could use any of these forms to fetch the
document. If you use separate fields, you'll need to expand/create
your query to search over them, of course.
2. If you have to index just the original Russian text, you might
generate all term forms while analyzing the query; then you could
treat the converted terms as synonyms and use a combination of
TermQuery for all term forms, or a MultiPhraseQuery for phrases.
For Solr, in this case you will probably need to add a custom filter
similar to SynonymFilter.
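As an illustration of the conversion step in idea 1, here is a small, self-contained sketch of the two character mappings. The class name, the layout strings, and the naive transliteration table are all assumptions made for this example; in Solr the same logic would live inside a custom TokenFilter that injects the extra forms as same-position synonyms.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of the two extra term forms: "wrong keyboard" and translit. */
public class RussianTermForms {
    private static final Map<Character, Character> WRONG_KEYBOARD = new HashMap<>();
    private static final Map<Character, String> TRANSLIT = new HashMap<>();
    static {
        // Russian ЙЦУКЕН layout keys paired with the QWERTY keys in the same positions
        String ru = "йцукенгшщзхъфывапролджэячсмитьбю";
        String en = "qwertyuiop[]asdfghjkl;'zxcvbnm,.";
        for (int i = 0; i < ru.length(); i++) {
            WRONG_KEYBOARD.put(ru.charAt(i), en.charAt(i));
        }
        // naive, illustrative transliteration table (not a complete standard)
        String[] pairs = {"а:a","б:b","в:v","г:g","д:d","е:e","ж:zh","з:z","и:i",
                "й:j","к:k","л:l","м:m","н:n","о:o","п:p","р:r","с:s","т:t",
                "у:u","ф:f","х:h","ц:c","ч:ch","ш:sh","щ:sch","ъ:","ы:y","ь:",
                "э:e","ю:yu","я:ya"};
        for (String p : pairs) {
            String[] kv = p.split(":", 2);
            TRANSLIT.put(kv[0].charAt(0), kv[1]);
        }
    }

    /** "москва" -> "vjcrdf": what the word looks like typed on a QWERTY layout. */
    public static String wrongLayout(String term) {
        StringBuilder sb = new StringBuilder();
        for (char c : term.toLowerCase().toCharArray()) {
            sb.append(WRONG_KEYBOARD.getOrDefault(c, c));
        }
        return sb.toString();
    }

    /** "москва" -> "moskva": naive transliteration. */
    public static String translit(String term) {
        StringBuilder sb = new StringBuilder();
        for (char c : term.toLowerCase().toCharArray()) {
            String t = TRANSLIT.get(c);
            sb.append(t != null ? t : c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(wrongLayout("Москва")); // vjcrdf
        System.out.println(translit("Москва"));    // moskva
    }
}
```

A filter following idea 1 would emit both generated forms at the same token position as the original term, so that any of the three spellings matches at query time.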

Hope this helps,
-Alexander

On Wed, Oct 27, 2010 at 1:31 PM, Pavel Minchenkov  wrote:
> Hi,
>
> When I'm trying to search Google with wrong keyboard layout -- it corrects
> my query, example: http://www.google.ru/search?q=vjcrdf (I typed word
> "Moscow" in Russian but in English keyboard layout).
> Also, when I'm searching using
> translit, It does the same: http://www.google.ru/search?q=moskva
>
> What is the right way to implement this feature in Solr?
>
> --
> Pavel Minchenkov
>


Re: question about SolrCore

2010-10-28 Thread Li Li
Is there anyone who could help me?

2010/10/11 Li Li :
> hi all,
>    I want to know the details of the IndexReader in SolrCore. I have read
> a little of SolrCore's code. Here is my understanding; is it correct?
>    Each SolrCore has many SolrIndexSearchers and keeps them in
> _searchers, and _searcher keeps track of the latest version of the index.
> Each SolrIndexSearcher has a SolrIndexReader. If there isn't any
> update, all these searchers share one single SolrIndexReader. If there
> is an update, then a newSearcher will be created, with a new
> SolrIndexReader associated with it.
>    I did a simple test.
>    A thread does a query and is blocked by a breakpoint. Then I feed some
> data to update the index. On commit, a newSearcher is created.
>    Here is the debug info:
>
>    SolrCore _searcher [solrindexsearc...@...ab]
>
> _searchers[solrindexsearc...@...77,solrindexsearc...@...ab,solrindexsearc...@..f8]
>                 solrindexsearc...@...77 's SolrIndexReader is the old one,
> and ab and f8 share the same newest SolrIndexReader.
>    When the query finished, solrindexsearc...@...77 was discarded. When
> the newSearcher succeeded in warming up, there was only one SolrIndexSearcher.
>    The SolrIndexReader of the old version of the index is discarded and
> only segments in the newest SolrIndexReader are referenced. Segments not
> in the new version can then be deleted because no file pointer references
> them.
>    Then I started 3 queries. There was only one SolrIndexSearcher, but
> refCount=4.
>    It seems many searches can share one single SolrIndexSearcher.
>    So in which situation will there exist more than one
> SolrIndexSearcher sharing just one SolrIndexReader?
>    Another question: for each version of the index, is there just one
> SolrIndexReader instance associated with it? Can it happen that more
> than one SolrIndexReader is open for the same version of the index?
>


Re: Stored or indexed?

2010-10-28 Thread Savvas-Andreas Moysidis
In our case, we just store a database id and do a secondary db query when
displaying the results.
This is handy and leads to a more centralised architecture when you need to
display properties of a domain object which you don't index/search.

On 28 October 2010 05:02, kenf_nc  wrote:

>
> Interesting wiki link, I hadn't seen that table before.
>
> And to answer your specific question about indexed=true, stored=false, this
> is most often done when you are using analyzers/tokenizers on your field.
> This field is for search only; you would never retrieve its contents for
> display. It may in fact be an amalgam of several fields in one 'content'
> field. You have your display copy stored in another field marked
> indexed=false, stored=true and optionally compressed. I also have simple
> string fields set to lowercase so searching is case-insensitive, and a
> duplicate field where the string is normal case. The first one is
> indexed/not stored, the second is stored/not indexed.
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Stored-or-indexed-tp1782805p1784315.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
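The setup kenf_nc describes could be sketched in schema.xml roughly as follows. Field and type names are illustrative, and "string_lowercase" is assumed to be a type built from KeywordTokenizer plus LowerCaseFilter:

```
<!-- search-only amalgam of several fields: analyzed, never returned -->
<field name="content" type="text" indexed="true" stored="false"/>
<!-- display copy: returned verbatim, never searched (compression was optional in Solr 1.x) -->
<field name="content_display" type="string" indexed="false" stored="true" compressed="true"/>
<!-- case-insensitive search copy plus original-case display copy -->
<field name="code"          type="string_lowercase" indexed="true"  stored="false"/>
<field name="code_original" type="string"           indexed="false" stored="true"/>
<copyField source="code_original" dest="code"/>
```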


Re: Inconsistent slave performance after optimize

2010-10-28 Thread Mason Hale
On Wed, Oct 27, 2010 at 8:59 PM, Jonathan Rochkind  wrote:

> Seriously, at least try JVM argument -XX:+UseConcMarkSweepGC .  That
> argument took care of very similar symptoms I was having.  I never did
> figure out exactly what was causing them, but at some point I tried that JVM
> argument, and they went away never to come back (which I guess is a clue
> about what was causing the slowdown, but the JVM still confuses me).
>

Will do. Thanks for the tip.

Mason



> 
> From: Mason Hale [masonh...@gmail.com]
> Sent: Wednesday, October 27, 2010 9:03 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Inconsistent slave performance after optimize
>
> On Wed, Oct 27, 2010 at 7:18 PM, Ken Krugler  >wrote:
>
> > Normally I'd say it sounded like you were getting into swap hell, but based on your
> > settings you only have 5GB of JVM space being used, on a 16GB box.
> >
> > Just to confirm, nothing else is using lots of memory, right? And the
> "top"
> > command isn't showing any swap usage, right?
> >
> >
> Correct. Only thing of note running on this machine is Solr.
>
> I don't have a poor performing server on hand at the moment, but I recall
> checking top when it was tanking, and it was not showing any swap usage.
>
>
> > When you encounter very slow search times, what does the top command say
> > about system load and cpu vs. I/O percentages?
> >
>
> I did look at iostat -x when the server was running slowly, and I/O util was
> 100%.
>
> This led me to believe the problem was cache-warming related, and that
> data needed to be loaded into Solr caches and/or files loaded into the
> file system cache.
>
> Does that yield any additional clues?
>
> If this does happen again, what stats should I collect?
>
> (Note to self: need to install sar on these servers to collect historical
> performance data...)
>
> Mason
>
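For anyone trying the suggestion, the flag goes on the JVM command line of the servlet container; for example (the heap size here is illustrative):

```
# Jetty:
java -Xmx5g -XX:+UseConcMarkSweepGC -jar start.jar

# Tomcat, via the container's environment:
export CATALINA_OPTS="-Xmx5g -XX:+UseConcMarkSweepGC"
```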


No response from Solr on complex request after several days

2010-10-28 Thread Xavier Schepler

Hi,

We are in a beta testing phase, with several users a day.

After several days of running, the Solr server stopped responding to
requests that require a lot of processing time.


I'm using Solr inside Tomcat.

This is the request that had no response from the server :

wt=json&omitHeader=true&q=qiAndMSwFR%3A%28transport%29&q.op=AND&start=0&rows=5&fl=id,domainId,solrLangCode,ddiFileId,studyDescriptionId,studyYearAndDescriptionId,nesstarServerId,studyNesstarId,variableId,questionId,variableNesstarId,concept,studyTitle,studyQuestionCount,hasMultipleItems,variableName,hasQuestionnaire,questionnaireUrl,studyDescriptionUrl,universe,notes,preQuestionText,postQuestionText,interviewerInstructions,questionPosition,vlFR,qFR,iFR,mFR,vlEN,qEN,iEN,mEN,&sort=score%20desc&fq=solrLangCode%3AFR&facet=true&facet.field=%7B%21ex%3DstudySerieIds%2Cdecades%2CstudyIds%2CqueryFilters%2CconceptIds%2CdomainIds%7DdomainId&facet.field=%7B%21ex%3DstudySerieIds%2Cdecades%2CstudyIds%2CqueryFilters%2CconceptIds%2CdomainIds%7DstudyDecade&facet.field=%7B%21ex%3DstudySerieIds%2Cdecades%2CstudyIds%2CqueryFilters%2CconceptIds%2CdomainIds%7DstudySerieId&facet.field=%7B%21ex%3DstudySerieIds%2Cdecades%2CstudyIds%2CqueryFilters%2CconceptIds%2CdomainIds%7DstudyYearAndDescriptionId&facet.sort=count&f.studyDecade.facet.sort=lex&spellcheck=true&spellcheck.count=10&spellcheck.dictionary=qiAndMFR&spellcheck.q=transport&hl=on&hl.fl=qSwFR,iHLSwFR,mHLSwFR&hl.fragsize=0&hl.snippets=1&hl.usePhraseHighlighter=true&hl.highlightMultiTerm=true&hl.simple.pre=%3Cb%3E&hl.simple.post=%3C%2Fb%3E&hl.mergeContiguous=false 



It involves highlighting on a multivalued field with more than 600 short 
values inside. It takes 200 or 300 ms because of highlighting.


After restarting tomcat all went fine again.

I'm trying to understand why I had to restart Tomcat and Solr, and what
I should do to keep it running 24/7.


Xavier




solr stuck writing to nonexistent sockets

2010-10-28 Thread Roxana Angheluta
Hi all,

We are using Solr over Jetty with a large index, sharded and distributed over 
multiple machines. Our queries are quite long, involving boolean and proximity 
operators. We cut the connection at the client side after 5 minutes. Also, we
are using the timeAllowed parameter to stop executing a query on the server
after a while.
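For context, timeAllowed is expressed in milliseconds and can be passed per request or set as a handler default; a sketch (the 300000 ms value and handler name are illustrative):

```
&timeAllowed=300000        (per request, in milliseconds)

<!-- or as a default in solrconfig.xml -->
<requestHandler name="standard" class="solr.SearchHandler">
  <lst name="defaults">
    <int name="timeAllowed">300000</int>
  </lst>
</requestHandler>
```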
We quite often run into situations when solr "blocks". The load on the server 
increases and a thread dump on the solr process shows many threads like below:


"btpool0-49" prio=10 tid=0x7f73afe1d000 nid=0x3581 runnable 
[0x451a]
   java.lang.Thread.State: RUNNABLE
at java.io.PrintWriter.write(PrintWriter.java:362)
at org.apache.solr.common.util.XML.escape(XML.java:206)
at org.apache.solr.common.util.XML.escapeCharData(XML.java:79)
at org.apache.solr.request.XMLWriter.writePrim(XMLWriter.java:832)
at org.apache.solr.request.XMLWriter.writeStr(XMLWriter.java:684)
at org.apache.solr.request.XMLWriter.writeVal(XMLWriter.java:564)
at org.apache.solr.request.XMLWriter.writeDoc(XMLWriter.java:435)
at org.apache.solr.request.XMLWriter$2.writeDocs(XMLWriter.java:514)
at org.apache.solr.request.XMLWriter.writeDocuments(XMLWriter.java:485)
at 
org.apache.solr.request.XMLWriter.writeSolrDocumentList(XMLWriter.java:494)
at org.apache.solr.request.XMLWriter.writeVal(XMLWriter.java:588)
at org.apache.solr.request.XMLWriter.writeResponse(XMLWriter.java:130)
at 
org.apache.solr.request.XMLResponseWriter.write(XMLResponseWriter.java:34)
at 
org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:325)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:254)
at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
at 
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
at 
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at 
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
at 
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
..


A netstat on the machine shows sockets in state CLOSE_WAIT. However, there are
fewer of them than the number of RUNNABLE threads like the above.

Why is this happening? Is there anything we can do to avoid getting in these 
situations?

Thanks,
roxana


  


Re: Searching with wrong keyboard layout or using translit

2010-10-28 Thread Pavel Minchenkov
Alexander,

Thanks,
Which variant has better performance?


2010/10/28 Alexander Kanarsky 

> Pavel,
>
> I think there is no single way to implement this. Some ideas that
> might be helpful:
>
> 1. Consider adding additional terms while indexing. This assumes
> converting the Russian text to both "translit" and "wrong keyboard"
> forms and indexing the converted terms along with the original terms (i.e. your
> Analyzer/Filter should produce Moskva and Vjcrdf for the term Москва). You
> may re-use the same field (if you plan on simple term queries) or
> create separate fields for the generated terms (better for phrase,
> proximity queries etc. since it keeps the original text's positional
> info). Then the query could use any of these forms to fetch the
> document. If you use separate fields, you'll need to expand/create
> your query to search over them, of course.
> 2. If you have to index just the original Russian text, you might
> generate all term forms while analyzing the query; then you could
> treat the converted terms as synonyms and use a combination of
> TermQuery for all term forms, or a MultiPhraseQuery for phrases.
> For Solr, in this case you will probably need to add a custom filter
> similar to SynonymFilter.
>
> Hope this helps,
> -Alexander
>
> On Wed, Oct 27, 2010 at 1:31 PM, Pavel Minchenkov 
> wrote:
> > Hi,
> >
> > When I'm trying to search Google with wrong keyboard layout -- it
> corrects
> > my query, example: http://www.google.ru/search?q=vjcrdf (I typed word
> > "Moscow" in Russian but in English keyboard layout).
> > Also, when I'm searching using
> > translit, It does the same: http://www.google.ru/search?q=moskva
> >
> > What is the right way to implement this feature in Solr?
> >
> > --
> > Pavel Minchenkov
> >
>



-- 
Pavel Minchenkov


QueryElevation Component is so slow

2010-10-28 Thread Chamnap Chhorn
Hi,

I'm using Solr 1.4 and the QueryElevation Component for guaranteed search
positions. I have around 700,000 documents with a 1 MB elevation file. It
turns out to be quite slow according to the New Relic monitoring site:

Slowest Components Count Exclusive Total   QueryElevationComponent 1 506,858
ms 100% 506,858 ms 100% SolrIndexSearcher 1 2.0 ms 0% 2.0 ms 0%
org.apache.solr.servlet.SolrDispatchFilter.doFilter() 1 1.0 ms 0% 506,862 ms
100% QueryComponent 1 1.0 ms 0% 1.0 ms 0% DebugComponent 1 0.0 ms 0% 0.0 ms
0% FacetComponent 1 0.0 ms 0% 0.0 ms 0%

As you could see, QueryElevationComponent takes quite a lot of time. Any
suggestion how to improve this?
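For context, the elevation file in question is elevate.xml, which maps whole query strings to pinned document ids and is parsed into memory when a searcher opens (and re-read on searcher reopen if it lives in the data directory). Its shape, with ids borrowed from the stock Solr example, is:

```
<elevate>
  <query text="ipod">
    <doc id="MA147LL/A" />
    <doc id="IW-02" exclude="true" />
  </query>
  <!-- one <query> block per elevated query string -->
</elevate>
```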

-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: QueryElevation Component is so slow

2010-10-28 Thread Chamnap Chhorn
Sorry for the very bad pasting. Here it is again.

Slowest Components                                     Count  Exclusive        Total
QueryElevationComponent                                1      506,858 ms 100%  506,858 ms 100%
SolrIndexSearcher                                      1      2.0 ms 0%        2.0 ms 0%
org.apache.solr.servlet.SolrDispatchFilter.doFilter()  1      1.0 ms 0%        506,862 ms 100%
QueryComponent                                         1      1.0 ms 0%        1.0 ms 0%
DebugComponent                                         1      0.0 ms 0%        0.0 ms 0%
FacetComponent                                         1      0.0 ms 0%        0.0 ms 0%

On Thu, Oct 28, 2010 at 4:57 PM, Chamnap Chhorn wrote:

> Hi,
>
> I'm using solr 1.4 and using QueryElevation Component for guaranteed search
> position. I have around 700,000 documents with 1 Mb elevation file. It turns
> out it is quite slow on the newrelic monitoring website:
>
> Slowest Components Count Exclusive Total   QueryElevationComponent 1
> 506,858 ms 100% 506,858 ms 100% SolrIndexSearcher 1 2.0 ms 0% 2.0 ms 0%
> org.apache.solr.servlet.SolrDispatchFilter.doFilter() 1 1.0 ms 0% 506,862 ms
> 100% QueryComponent 1 1.0 ms 0% 1.0 ms 0% DebugComponent 1 0.0 ms 0% 0.0 ms
> 0% FacetComponent 1 0.0 ms 0% 0.0 ms 0%
>
> As you could see, QueryElevationComponent takes quite a lot of time. Any
> suggestion how to improve this?
>
> --
> Chhorn Chamnap
> http://chamnapchhorn.blogspot.com/
>



-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Possible bug in query sorting

2010-10-28 Thread Pablo Recio
Hi all. I'm having a problem with solr sorting search results.

When I try to make a query and sort it by title:

http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&indent=on&sort=title%20desc

I get the error below [1]. If I try to sort by another indexed field it works;
indeed, if I change the title field's name in the Solr schema to titlx, for
example, it works.

Is it a bug? Has anyone had the same problem?

[1] HTTP ERROR: 500

501

java.lang.ArrayIndexOutOfBoundsException: 501
at 
org.apache.lucene.search.FieldCacheImpl$StringIndexCache.createValue(FieldCacheImpl.java:721)
at 
org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:224)
at 
org.apache.lucene.search.FieldCacheImpl.getStringIndex(FieldCacheImpl.java:692)
at 
org.apache.lucene.search.FieldComparator$StringOrdValComparator.setNextReader(FieldComparator.java:667)
at 
org.apache.lucene.search.TopFieldCollector$OneComparatorNonScoringCollector.setNextReader(TopFieldCollector.java:94)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:249)
at org.apache.lucene.search.Searcher.search(Searcher.java:171)
at 
org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:988)
at 
org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:884)
at 
org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:341)
at 
org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:182)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
at 
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
at 
org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
at 
org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
at org.mortbay.jetty.Server.handle(Server.java:285)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
at 
org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
at 
org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
at 
org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)

RequestURI=*/solr/select/*

Powered by Jetty://


No response from Solr on complex request (real issue explained)

2010-10-28 Thread Xavier Schepler

Hi,

We are in a beta testing phase, with several users a day.

After several days of running well, the solr server stopped responding 
to requests that require a lot of processing time, like this one :


wt=json&omitHeader=true&q=qiAndMSwFR%3A%28transport%29&q.op=AND&start=0&rows=5&fl=id,domainId,solrLangCode,ddiFileId,studyDescriptionId,studyYearAndDescriptionId,nesstarServerId,studyNesstarId,variableId,questionId,variableNesstarId,concept,studyTitle,studyQuestionCount,hasMultipleItems,variableName,hasQuestionnaire,questionnaireUrl,studyDescriptionUrl,universe,notes,preQuestionText,postQuestionText,interviewerInstructions,questionPosition,vlFR,qFR,iFR,mFR,vlEN,qEN,iEN,mEN,&sort=score%20desc&fq=solrLangCode%3AFR&facet=true&facet.field=%7B%21ex%3DstudySerieIds%2Cdecades%2CstudyIds%2CqueryFilters%2CconceptIds%2CdomainIds%7DdomainId&facet.field=%7B%21ex%3DstudySerieIds%2Cdecades%2CstudyIds%2CqueryFilters%2CconceptIds%2CdomainIds%7DstudyDecade&facet.field=%7B%21ex%3DstudySerieIds%2Cdecades%2CstudyIds%2CqueryFilters%2CconceptIds%2CdomainIds%7DstudySerieId&facet.field=%7B%21ex%3DstudySerieIds%2Cdecades%2CstudyIds%2CqueryFilters%2CconceptIds%2CdomainIds%7DstudyYearAndDescriptionId&facet.sort=count&f.studyDecade.facet.sort=lex&spellcheck=true&spellcheck.count=10&spellcheck.dictionary=qiAndMFR&spellcheck.q=transport&hl=on&hl.fl=qSwFR,iHLSwFR,mHLSwFR&hl.fragsize=0&hl.snippets=1&hl.usePhraseHighlighter=true&hl.highlightMultiTerm=true&hl.simple.pre=%3Cb%3E&hl.simple.post=%3C%2Fb%3E&hl.mergeContiguous=false 



It involves highlighting on a multivalued field with more than 600 short 
values inside. Usually, it takes 200 or 300 ms.


I'm using Solr within Tomcat.
After restarting Tomcat all went fine again.

I'm trying to understand why I had to restart Tomcat and what I should
do to keep it running 24/7.



Xavier



Re: Possible bug in query sorting

2010-10-28 Thread Michael McCandless
Is it somehow possible that you are trying to sort by a multi-valued field?

Mike

On Thu, Oct 28, 2010 at 6:59 AM, Pablo Recio  wrote:
> Hi all. I'm having a problem with solr sorting search results.
>
> When I try to make a query and sort it by title:
>
> http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&indent=on&sort=title%20desc
>
> I get that error [1]. If I try to sort by other indexed field it works, indeed
> if I change in solr schema title name to titlx, for example, it works.
>
> It's a bug? Anyone has the same problem?
>
> [1] HTTP ERROR: 500
>
> 501
>
> java.lang.ArrayIndexOutOfBoundsException: 501
>    at 
> org.apache.lucene.search.FieldCacheImpl$StringIndexCache.createValue(FieldCacheImpl.java:721)
>    at 
> org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:224)
>    at 
> org.apache.lucene.search.FieldCacheImpl.getStringIndex(FieldCacheImpl.java:692)
>    at 
> org.apache.lucene.search.FieldComparator$StringOrdValComparator.setNextReader(FieldComparator.java:667)
>    at 
> org.apache.lucene.search.TopFieldCollector$OneComparatorNonScoringCollector.setNextReader(TopFieldCollector.java:94)
>    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:249)
>    at org.apache.lucene.search.Searcher.search(Searcher.java:171)
>    at 
> org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:988)
>    at 
> org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:884)
>    at 
> org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:341)
>    at 
> org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:182)
>    at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
>    at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
>    at 
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
>    at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
>    at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
>    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
>    at 
> org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
>    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
>    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
>    at 
> org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
>    at 
> org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
>    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
>    at org.mortbay.jetty.Server.handle(Server.java:285)
>    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
>    at 
> org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
>    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
>    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
>    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
>    at 
> org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
>    at 
> org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
>
> RequestURI=*/solr/select/*
>
> Powered by Jetty://
>
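If Mike's guess is right, the usual workaround is to sort on a separate single-valued, untokenized copy of the field. A hedged schema.xml sketch (names illustrative; the copyField only works if each document really carries one title value):

```
<!-- analyzed: fine for searching, not for sorting -->
<field name="title" type="text" indexed="true" stored="true"/>
<!-- single-valued, single-token copy to sort on instead -->
<field name="title_sort" type="string" indexed="true" stored="false" multiValued="false"/>
<copyField source="title" dest="title_sort"/>
```

The query would then use &sort=title_sort%20desc instead of sorting on title directly.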


Re: If I want to move a core from one physical machine to another....

2010-10-28 Thread Ken Stanley
On Wed, Oct 27, 2010 at 6:12 PM, Ron Mayer  wrote:

> If I want to move a core from one physical machine to another,
> is it as simple as just
>   scp -r core5 otherserver:/path/on/other/server/
> and then adding
>
> on that other server's solr.xml file and restarting the server there?
>
>
>
> PS: Should have I been able to figure the answer to that
>out by RTFM somewhere?
>

Ron,

In our current environment I index all of our data on one machine, and to
save time with "replication", I use scp to copy the data directory over to
our other servers. On the server that I copy from, I don't turn SOLR off,
but on the servers that I copy to, I shutdown tomcat; remove the data
directory; mv the data directory I scp'd from the source; turn tomcat back
on. I do it this way (especially with mv, versus cp) because it is the
fastest way to get the data on the other servers. And, as Gora pointed out,
you need to make sure that your configuration files match (specifically the
schema.xml) the source.

- Ken


RE: If I want to move a core from one physical machine to another....

2010-10-28 Thread Ephraim Ofir
How is this better than replication?

Ephraim Ofir


-Original Message-
From: Ken Stanley [mailto:doh...@gmail.com] 
Sent: Thursday, October 28, 2010 1:59 PM
To: solr-user@lucene.apache.org
Subject: Re: If I want to move a core from one physical machine to another

On Wed, Oct 27, 2010 at 6:12 PM, Ron Mayer  wrote:

> If I want to move a core from one physical machine to another,
> is it as simple as just
>   scp -r core5 otherserver:/path/on/other/server/
> and then adding
>
> on that other server's solr.xml file and restarting the server there?
>
>
>
> PS: Should have I been able to figure the answer to that
>out by RTFM somewhere?
>

Ron,

In our current environment I index all of our data on one machine, and to
save time with "replication", I use scp to copy the data directory over to
our other servers. On the server that I copy from, I don't turn SOLR off,
but on the servers that I copy to, I shutdown tomcat; remove the data
directory; mv the data directory I scp'd from the source; turn tomcat back
on. I do it this way (especially with mv, versus cp) because it is the
fastest way to get the data on the other servers. And, as Gora pointed out,
you need to make sure that your configuration files match (specifically the
schema.xml) the source.

- Ken


Re: If I want to move a core from one physical machine to another....

2010-10-28 Thread Ken Stanley
On Thu, Oct 28, 2010 at 8:07 AM, Ephraim Ofir  wrote:

> How is this better than replication?
>
> Ephraim Ofir
>
>
It's not; for our needs here, we have not set up replication through SOLR.
We are working through OOM problems/performance tuning first, then "best
practices" second. I just wanted the OP to know that it can be done, and how
we do it. :)


spellcheck component does not work with request handler

2010-10-28 Thread abhayd

I am using SOLR 1.3

I wanted to add the spellcheck component to the standard request handler, so
I did this:

  <requestHandler name="standard" class="solr.SearchHandler" default="true">
    <lst name="defaults">
      <str name="echoParams">explicit</str>
      <arr name="last-components">
        <str>spellcheck</str>
      </arr>
    </lst>
  </requestHandler>

but for some reason it does not return suggestions for misspelled words. For
instance, iphole does not get a suggestion of iphone.
Here is my query:
http://localhost:10101/solr/core1/select?q=user_query:iphole&spellcheck=true&spellcheck.collate=true

At the same time when I added another request handler
 

  
  false
  
  false
  
  1


  spellcheck

  
it works fine and returns suggestions.
Here is my query:
http://localhost:10101/solr/core1/spell?q=iphole&spellcheck=true&spellcheck.collate=true

Any thoughts on why it is not working?
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/spellcheck-component-does-not-work-with-request-handler-tp1786079p1786079.html
Sent from the Solr - User mailing list archive at Nabble.com.


How to use polish stemmer - Stempel - in schema.xml?

2010-10-28 Thread Jakub Godawa
Hi!
There is a Polish stemmer, http://www.getopt.org/stempel/, and I have
problems connecting it with Solr 1.4.1.
Questions:

1. Where EXACTLY do I put the "stempel-1.0.jar" file?
2. How do I register the file, so I can build a fieldType like:


  


3. Is that the right approach to make it work?

Thanks for verbose explanation,
Jakub.
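One common arrangement, offered as a sketch rather than a verified recipe: drop the jar into the core's lib directory (next to conf/), which Solr adds to its classpath on startup, and reference the analyzer class from the Stempel distribution in a fieldType. The class name below is taken from the Stempel documentation and should be double-checked against the jar's contents:

```
<!-- schema.xml -->
<fieldType name="text_pl" class="solr.TextField">
  <analyzer class="org.getopt.stempel.lucene.StempelAnalyzer"/>
</fieldType>
```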


Commit/Optimise question

2010-10-28 Thread Savvas-Andreas Moysidis
Hello,

We currently index our data through a SQL DIH setup, but due to our model
(and therefore SQL query) becoming complex, we need to index our data
programmatically. As we didn't have to deal with commit/optimise before, we
are now wondering whether there is an optimal approach to that. Is there a
batch size after which we should fire a commit or should we execute a commit
after indexing all of our data? What about optimise?

Our document corpus is > 4m documents and through DIH the resulting index is
around 1.5G

We have searched previous posts but couldn't find a definite answer. Any
input much appreciated!

Regards,
-- Savvas


RE: spellcheck component does not work with request handler

2010-10-28 Thread Dyer, James
In your "standard" search handler, you have the "last-components" array inside
<defaults>.  However, it should be outside, as in the "/spell" search
handler.  Try this:

<requestHandler name="standard" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311


-Original Message-
From: abhayd [mailto:ajdabhol...@hotmail.com] 
Sent: Thursday, October 28, 2010 7:48 AM
To: solr-user@lucene.apache.org
Subject: spellcheck component does not work with request handler


I am using SOLR 1.3

I wanted to add the spellcheck component to the standard request handler, so
I did this:

  <requestHandler name="standard" class="solr.SearchHandler" default="true">
    <lst name="defaults">
      <str name="echoParams">explicit</str>
      <arr name="last-components">
        <str>spellcheck</str>
      </arr>
    </lst>
  </requestHandler>

but for some reason it does not return suggestions for misspelled words. For
instance, iphole does not get a suggestion of iphone.
Here is my query:
http://localhost:10101/solr/core1/select?q=user_query:iphole&spellcheck=true&spellcheck.collate=true

At the same time when I added another request handler
 

  
  false
  
  false
  
  1


  spellcheck

  
it works fine and returns suggestions.
Here is my query:
http://localhost:10101/solr/core1/spell?q=iphole&spellcheck=true&spellcheck.collate=true

Any thoughts on why it is not working?
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/spellcheck-component-does-not-work-with-request-handler-tp1786079p1786079.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: documentCache clarification

2010-10-28 Thread Jay Luker
On Wed, Oct 27, 2010 at 9:13 PM, Chris Hostetter
 wrote:
>
> : schema.) My evidence for this is the documentCache stats reported by
> : solr/admin. If I request "rows=10&fl=id" followed by
> : "rows=10&fl=id,title" I would expect to see the 2nd request result in
> : a 2nd insert to the cache, but instead I see that the 2nd request hits
> : the cache from the 1st request. "rows=10&fl=*" does the same thing.
>
> your evidence is correct, but your interpretation is incorrect.
>
> the objects in the documentCache are lucene Documents, which contain a
> List of Field references.  when enableLazyFieldLoading=true is set, a
> Document fetched from the IndexReader only contains the Fields specified
> in the fl, and all other Fields are marked as "LOAD_LAZY".
>
> When there is a cache hit on that uniqueKey at a later date, the Fields
> already loaded are used directly if requested, but the Fields marked
> LOAD_LAZY are (you guessed it) lazy loaded from the IndexReader and then
> the Document updates the reference to the newly actualized fields (which
> are no longer marked LOAD_LAZY)
>
> So with different "fl" params, the same Document Object is continually
> used, but the Fields in that Document grow as the fields requested (using
> the "fl" param) change.

Great stuff. Makes sense. Thanks for the clarification, and if no one
objects I'll update the wiki with some of this info.

I'm still not clear on this statement from the wiki's description of
the documentCache: "(Note: This cache cannot be used as a source for
autowarming because document IDs will change when anything in the
index changes so they can't be used by a new searcher.)"

Can anyone elaborate a bit on that. I think I've read it at least 10
times and I'm still unable to draw a mental picture. I'm wondering if
the document IDs referred to are the ones I'm defining in my schema,
or are they the underlying lucene ids, i.e. the ones that, according
to the Lucene in Action book, are "relative within each segment"?


> : will *not* result in an insert to queryResultCache. I have tried
> : various increments--10, 100, 200, 500--and it seems the magic number
> : is somewhere between 200 (cache insert) and 500 (no insert). Can
> : someone explain this?
>
> In addition to the <queryResultMaxDocsCached> config option already
> mentioned (which controls whether a DocList is cached based on its size)
> there is also the <queryResultWindowSize> config option which may confuse
> your cache observations.  if the window size is "50" and you ask for
> start=0&rows=10 what actually gets cached is "0-50" (assuming there are
> more than 50 results) so a subsequent request for start=10&rows=10 will be
> a cache hit.
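For reference, the options discussed in this thread all live in solrconfig.xml; the element names below are the standard ones, but the sizes are purely illustrative:

```xml
<!-- caches stored Lucene Documents; with lazy loading enabled, a cached
     Document only holds the Fields that have been requested so far -->
<documentCache class="solr.LRUCache" size="512" initialSize="512"/>
<enableLazyFieldLoading>true</enableLazyFieldLoading>

<!-- ordered doc-id lists are cached in windows of this size, so
     start=10&rows=10 can hit an entry cached for start=0&rows=10 -->
<queryResultWindowSize>50</queryResultWindowSize>
<!-- result sets larger than this are not inserted into the cache -->
<queryResultMaxDocsCached>200</queryResultMaxDocsCached>
<queryResultCache class="solr.LRUCache" size="512" initialSize="512"/>
```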

Just so I'm clear, does the queryResultCache operate in a similar
manner as the documentCache as to what is actually cached? In other
words, is it the caching of the docList object that is reported in the
cache statistics hits/inserts numbers? And that object would get
updated with a new set of ordered doc ids on subsequent, larger
requests. (I'm flailing a bit to articulate the question, I know). For
example, if my queryResultMaxDocsCached is set to 200 and I issue a
request with rows=500, then I won't get a docList object entry in the
queryResultCache. However, if I issue a request with rows=10, I will
get an insert, and then a later request for rows=500 would re-use and
update that original cached docList object. Right? And would it be
updated with the full list of 500 ordered doc ids or only 200?

Thanks,
--jay


Overriding Tika's field processing

2010-10-28 Thread Tod
I'm reading my document data from a CMS and indexing it using calls to 
curl.  The curl call includes 'stream.url' so Tika will also index the 
actual document pointed to by the CMS' stored url.  This works fine.


On the presentation side I have a dropdown with the titles of all the indexed 
documents such that when a user clicks one of them it opens in a new 
window.  Using JS, I've been parsing the JSON returned from Solr to 
create the dropdown.  The problem is I can't get the titles sorted 
alphabetically.


If I use a facet.sort on the title field I get back ALL the sorted 
titles in the facet block, but that doesn't include the associated 
URLs.  A sorted query won't work because title is a multivalued field.


The one option I can think of is to make the title single-valued so that 
I have a one-to-one relationship to the returned URL.  To do that I'd 
need to be able to *not* index the Tika-returned values.


If I read right, my understanding was that I could use 'literal.title' 
in the curl call to limit what would be included in the index from Tika. 
 That doesn't seem to be working as a test facet query returns more 
than I have in the CMS.


Am I understanding the 'literal.title' processing correctly?  Does 
anybody have experience/suggestions on how to handle this?
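For what it's worth, a literal.* value is normally passed to the extract handler alongside stream.url roughly like this (the host, id, and title values here are illustrative only):

```
curl "http://localhost:8983/solr/update/extract?literal.id=doc1&literal.title=My+CMS+Title&stream.url=http://cms.example.com/doc.pdf&commit=true"
```

Note that literal.title supplies a value for the field; whether Tika's extracted title also lands in it may depend on the field being multivalued and on any fmap.* settings in the handler, so a literal value alone may not suppress Tika's own metadata.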



Thanks - Tod



Re: Looking for Developers

2010-10-28 Thread Ravi Gidwani
May I suggest a new mailing list like solr-jobs (if it does not exist) or
something for such emails ? I think it is also important for the solr
developers to get emails about job opportunities ? No ?

~Ravi.

On Tue, Oct 26, 2010 at 11:42 PM, Pradeep Singh  wrote:

> This is the second time he has sent this shit. Kill his subscription. Is it
> possible?
>
> On Tue, Oct 26, 2010 at 10:38 PM, Yuchen Wang  wrote:
>
> > UNSUBSCRIBE
> >
> > On Tue, Oct 26, 2010 at 10:15 PM, Igor Chudov  wrote:
> >
> > > UNSUBSCRIBE
> > >
> > > On Wed, Oct 27, 2010 at 12:14 AM, ST ST  wrote:
> > > > Looking for Developers Experienced in Solr/Lucene And/OR FAST Search
> > > Engines
> > > > from India (Pune)
> > > >
> > > > We are looking for off-shore India Based Developers who are
> proficient
> > in
> > > > Solr/Lucene and/or FAST search engine .
> > > > Developers in the cities of Pune/Bombay in India are preferred.
> > > Development
> > > > is for projects based in US for a reputed firm.
> > > >
> > > > If you are proficient in Solr/Lucene/FAST and have 5 years minimum
> > > industry
> > > > experience with atleast 3 years in Search Development,
> > > > please send me your resume.
> > > >
> > > > Thanks
> > > >
> > >
> >
>


Re: Looking for Developers

2010-10-28 Thread Stefan Moises
Well, I don't see a problem sending (serious) job offers to this list... 
as long as nobody spams


just my 2c
Stefan

On 28.10.2010 19:57, Ravi Gidwani wrote:

May I suggest a new mailing list like solr-jobs (if it does not exist) or
something for such emails ? I think it is also important for the solr
developers to get emails about job opportunities ? No ?

~Ravi.

On Tue, Oct 26, 2010 at 11:42 PM, Pradeep Singh  wrote:


This is the second time he has sent this shit. Kill his subscription. Is it
possible?

On Tue, Oct 26, 2010 at 10:38 PM, Yuchen Wang  wrote:


UNSUBSCRIBE

On Tue, Oct 26, 2010 at 10:15 PM, Igor Chudov  wrote:


UNSUBSCRIBE

On Wed, Oct 27, 2010 at 12:14 AM, ST ST  wrote:

Looking for Developers Experienced in Solr/Lucene And/OR FAST Search

Engines

from India (Pune)

We are looking for off-shore India Based Developers who are

proficient

in

Solr/Lucene and/or FAST search engine .
Developers in the cities of Pune/Bombay in India are preferred.

Development

is for projects based in US for a reputed firm.

If you are proficient in Solr/Lucene/FAST and have 5 years minimum

industry

experience with atleast 3 years in Search Development,
please send me your resume.

Thanks



--
***
Stefan Moises
Senior Softwareentwickler

shoptimax GmbH
Guntherstraße 45 a
90461 Nürnberg
Amtsgericht Nürnberg HRB 21703
GF Friedrich Schreieck

Tel.: 0911/25566-25
Fax:  0911/25566-29
moi...@shoptimax.de
http://www.shoptimax.de
***



Re: Keeping "qt" parameter in distributed search

2010-10-28 Thread Chris Hostetter

: Is there any way to preserve qt in a distributed search so this doesn't
: happen?  I am using Solr 1.4.1, but we are upgrading to 3.1-dev very soon.

I'm not very knowledgeable about how distributed searching deals with 
request handlers, url paths, and the qt param (i have no idea why the 
exact same handler isn't propagated to the remote shards by default -- i 
thought it was, but your email suggests that it isn't) but i have seen a 
"shards.qt" param mentioned on the list and in some tests.

It doesn't appear to be documented on the wiki anywhere, but you may want 
to search for it and see if it helps your situation.
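For illustration, shards.qt is just a request parameter naming the handler that the shard sub-requests should use; it can also be pinned as a default in the handler definition (the handler name here is an assumption):

```xml
<requestHandler name="/mysearch" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- make the distributed sub-requests hit this same handler -->
    <str name="shards.qt">/mysearch</str>
  </lst>
</requestHandler>
```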

(NOTE: posting your solrconfig.xml handler definitions and the example 
URLs you are trying to load will help people understand the behavior you 
are seeing and give better assistance)


-Hoss


Re: Looking for Developers

2010-10-28 Thread Mark Miller
Right - historically it's been fine because it hasn't grown into a
problem issue. Hopefully it just stays that way.

- Mark

On 10/28/10 2:00 PM, Stefan Moises wrote:
> Well, I don't see a problem sending (serious) job offers to this list...
> as long as nobody spams
> 
> just my 2c
> Stefan
> 
> Am 28.10.2010 19:57, schrieb Ravi Gidwani:
>> May I suggest a new mailing list like solr-jobs (if it does not exist) or
>> something for such emails ? I think it is also important for the solr
>> developers to get emails about job opportunities ? No ?
>>
>> ~Ravi.
>>
>> On Tue, Oct 26, 2010 at 11:42 PM, Pradeep Singh 
>> wrote:
>>
>>> This is the second time he has sent this shit. Kill his subscription.
>>> Is it
>>> possible?
>>>
>>> On Tue, Oct 26, 2010 at 10:38 PM, Yuchen Wang  wrote:
>>>
 UNSUBSCRIBE

 On Tue, Oct 26, 2010 at 10:15 PM, Igor Chudov 
 wrote:

> UNSUBSCRIBE
>
> On Wed, Oct 27, 2010 at 12:14 AM, ST ST  wrote:
>> Looking for Developers Experienced in Solr/Lucene And/OR FAST Search
> Engines
>> from India (Pune)
>>
>> We are looking for off-shore India Based Developers who are
>>> proficient
 in
>> Solr/Lucene and/or FAST search engine .
>> Developers in the cities of Pune/Bombay in India are preferred.
> Development
>> is for projects based in US for a reputed firm.
>>
>> If you are proficient in Solr/Lucene/FAST and have 5 years minimum
> industry
>> experience with atleast 3 years in Search Development,
>> please send me your resume.
>>
>> Thanks
>>
> 



Re: Sorting and filtering on fluctuating multi-currency price data?

2010-10-28 Thread Chris Hostetter

: Another approach would be to use ExternalFileField and keep the price data,
: normalized to USD, outside of the index. Every time the currency rates
: changed, we would calculate new normalized prices for every document in the
: index.

...that is the approach i would normally suggest.

: Still another approach would be to do the currency conversion at IndexReader
: warmup time. We would index native price and currency code and create a
: normalized currency field on the fly. This would be somewhat like
: ExternalFileField in that it involved data from outside the index, but it
: wouldn't need to be scoped to the parent SolrIndexReader, but could be
: per-segment. Perhaps a custom poly-field could accomplish something like
: this?

...that would essentially be what ExternalFileField should start doing, it 
just hasn't had anyone bite the bullet to implement it yet -- if you want 
to tackle that, then i would suggest/request/encourage you to look at 
doing it as a patch to ExternalFileField that could be contributed back 
and reused by all.

With all of that said: there has also been a recent contribution of a 
"MoneyFieldType" for dealing precisesly with multicurrency 
sorting/filtering issues that you should definitley take a look at...

https://issues.apache.org/jira/browse/SOLR-2202

-Hoss


Re: Looking for Developers

2010-10-28 Thread rajini maski
It's better if we can make some solr-job list.. that would be better.. if
not,
 chances that this mailing list of solr queries becomes less of that and more
like a job forum.. this mailing list is so useful to all developers to get
answers for their technical queries..


On Thu, Oct 28, 2010 at 11:30 PM, Stefan Moises  wrote:

> Well, I don't see a problem sending (serious) job offers to this list... as
> long as nobody spams
>
> just my 2c
> Stefan
>
> On 28.10.2010 19:57, Ravi Gidwani wrote:
>
> May I suggest a new mailing list like solr-jobs (if it does not exist) or
>> something for such emails ? I think it is also important for the solr
>> developers to get emails about job opportunities ? No ?
>>
>> ~Ravi.
>>
>> On Tue, Oct 26, 2010 at 11:42 PM, Pradeep Singh
>>  wrote:
>>
>> This is the second time he has sent this shit. Kill his subscription. Is
>>> it
>>> possible?
>>>
>>> On Tue, Oct 26, 2010 at 10:38 PM, Yuchen Wang  wrote:
>>>
>>> UNSUBSCRIBE

 On Tue, Oct 26, 2010 at 10:15 PM, Igor Chudov
  wrote:

 UNSUBSCRIBE
>
> On Wed, Oct 27, 2010 at 12:14 AM, ST ST  wrote:
>
>> Looking for Developers Experienced in Solr/Lucene And/OR FAST Search
>>
> Engines
>
>> from India (Pune)
>>
>> We are looking for off-shore India Based Developers who are
>>
> proficient
>>>
 in

> Solr/Lucene and/or FAST search engine .
>> Developers in the cities of Pune/Bombay in India are preferred.
>>
> Development
>
>> is for projects based in US for a reputed firm.
>>
>> If you are proficient in Solr/Lucene/FAST and have 5 years minimum
>>
> industry
>
>> experience with atleast 3 years in Search Development,
>> please send me your resume.
>>
>> Thanks
>>
>>
> --
> ***
> Stefan Moises
> Senior Softwareentwickler
>
> shoptimax GmbH
> Guntherstraße 45 a
> 90461 Nürnberg
> Amtsgericht Nürnberg HRB 21703
> GF Friedrich Schreieck
>
> Tel.: 0911/25566-25
> Fax:  0911/25566-29
> moi...@shoptimax.de
> http://www.shoptimax.de
> ***
>
>


Re: how well does multicore scale?

2010-10-28 Thread Dennis Gearon
This is why using 'groups' as intermediary permission objects came into 
existence in databases.

Dennis Gearon

Signature Warning

It is always a good idea to learn from your own mistakes. It is usually a 
better idea to learn from others’ mistakes, so you do not have to make them 
yourself. from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'

EARTH has a Right To Life,
  otherwise we all die.


--- On Wed, 10/27/10, mike anderson  wrote:

> From: mike anderson 
> Subject: Re: how well does multicore scale?
> To: solr-user@lucene.apache.org
> Date: Wednesday, October 27, 2010, 5:20 AM
> Tagging every document with a few
> hundred thousand 6 character user-ids
> would  increase the document size by two orders of
> magnitude. I can't
> imagine why this wouldn't mean the index would increase by
> just as much
> (though I really don't know much about that file
> structure). By my simple
> math, this would mean that if we want each shard's index to
> be able to fit
> in memory, then (even with some beefy servers) each query
> would have to go
> out to a few thousand shards (as opposed to 21 if we used
> the MultiCore
> approach). This means the typical response time would be
> much slower.
> 
> 
> -mike
> 
> On Tue, Oct 26, 2010 at 10:15 AM, Jonathan Rochkind wrote:
> 
> > mike anderson wrote:
> >
> >> I'm really curious if there is a clever solution
> to the obvious problem
> >> with: "So your better off using a single index and
> with a user id and use
> >> a query filter with the user id when fetching
> data.", i.e.. when you have
> >> hundreds of thousands of user IDs tagged on each
> article. That just
> >> doesn't
> >> sound like it scales very well..
> >>
> >>
> > Actually, I think that design would scale pretty fine,
> I don't think
> > there's an 'obvious' problem. You store your userIDs
> in a multi-valued field
> > (or as multiple terms in a single value, ends up being
> similar). You fq on
> > there with the current
> userID.   There's one way to find out of
> course, but
> > that doesn't seem a patently ridiculous scenario or
> anything, that's the
> > kind of thing Solr is generally good at, it's what
> it's built for.   The
> > problem might actually be in the time it takes to add
> such a document to the
> > index; but not in query time.
> >
> > Doesn't mean it's the best solution for your problem
> though, I can't say.
> >
> > My impression is that Solr in general isn't really
> designed to support the
> > kind of multi-tenancy use case people are talking
> about lately.  So trying
> > to make it work anyway... if multi-cores work for you,
> then great, but be
> > aware they weren't really designed for that (having
> thousands of cores) and
> > may not. If a single index can work for you instead,
> great, but as you've
> > discovered it's not neccesarily obvious how to set up
> the schema to do what
> > you need -- really this applies to Solr in general,
> unlike an rdbms where
> > you just third-form-normalize everything and figure
> it'll work for almost
> > any use case that comes up,  in Solr you
> generally need to custom fit the
> > schema for your particular use cases, sometimes being
> kind of clever to
> > figure out the optimal way to do that.
> >
> > This is, I'd argue/agree, indeed kind of a
> disadvantage, setting up a Solr
> > index takes more intellectual work than setting up an
> rdbms. The trade off
> > is you get speed, and flexible ways to set up
> relevancy (that still perform
> > well). Took a couple decades for rdbms to get as
> brainless to use as they
> > are, maybe in a couple more we'll have figured out
> ways to make indexing
> > engines like solr equally brainless, but not yet --
> but it's still pretty
> > damn easy for what it is, the lucene/Solr folks have
> done a remarkable job.
> >
>


Re: Possible bug in query sorting

2010-10-28 Thread Gora Mohanty
On Thu, Oct 28, 2010 at 5:18 PM, Michael McCandless
 wrote:
> Is it somehow possible that you are trying to sort by a multi-valued field?
[...]

Either that, or your field gets processed into multiple tokens via the
analyzer/tokenizer path in your schema. The reported error is a
consequence of the fact that different documents might result in a
different number of tokens.

Please show us the part of schema.xml that defines the field type for
the field "title".

Regards,
Gora


Re: Looking for Developers

2010-10-28 Thread Michael McCandless
I don't think we should do this until it becomes a "real" problem.

The number of job offers is tiny compared to dev emails, so far, as
far as I can tell.

Mike

On Thu, Oct 28, 2010 at 2:10 PM, rajini maski  wrote:
> It's better if we can make some solr-job list.. that would be better.. if
> not,
>  chances that this mailing list of solr queries becomes less of that and more
> like a job forum.. this mailing list is so useful to all developers to get
> answers for their technical queries..
>
>
> On Thu, Oct 28, 2010 at 11:30 PM, Stefan Moises  wrote:
>
>> Well, I don't see a problem sending (serious) job offers to this list... as
>> long as nobody spams
>>
>> just my 2c
>> Stefan
>>
>> On 28.10.2010 19:57, Ravi Gidwani wrote:
>>
>> May I suggest a new mailing list like solr-jobs (if it does not exist) or
>>> something for such emails ? I think it is also important for the solr
>>> developers to get emails about job opportunities ? No ?
>>>
>>> ~Ravi.
>>>
>>> On Tue, Oct 26, 2010 at 11:42 PM, Pradeep Singh
>>>  wrote:
>>>
>>> This is the second time he has sent this shit. Kill his subscription. Is
 it
 possible?

 On Tue, Oct 26, 2010 at 10:38 PM, Yuchen Wang  wrote:

 UNSUBSCRIBE
>
> On Tue, Oct 26, 2010 at 10:15 PM, Igor Chudov
>  wrote:
>
> UNSUBSCRIBE
>>
>> On Wed, Oct 27, 2010 at 12:14 AM, ST ST  wrote:
>>
>>> Looking for Developers Experienced in Solr/Lucene And/OR FAST Search
>>>
>> Engines
>>
>>> from India (Pune)
>>>
>>> We are looking for off-shore India Based Developers who are
>>>
>> proficient

> in
>
>> Solr/Lucene and/or FAST search engine .
>>> Developers in the cities of Pune/Bombay in India are preferred.
>>>
>> Development
>>
>>> is for projects based in US for a reputed firm.
>>>
>>> If you are proficient in Solr/Lucene/FAST and have 5 years minimum
>>>
>> industry
>>
>>> experience with atleast 3 years in Search Development,
>>> please send me your resume.
>>>
>>> Thanks
>>>
>>>
>> --
>> ***
>> Stefan Moises
>> Senior Softwareentwickler
>>
>> shoptimax GmbH
>> Guntherstraße 45 a
>> 90461 Nürnberg
>> Amtsgericht Nürnberg HRB 21703
>> GF Friedrich Schreieck
>>
>> Tel.: 0911/25566-25
>> Fax:  0911/25566-29
>> moi...@shoptimax.de
>> http://www.shoptimax.de
>> ***
>>
>>
>


Re: Looking for Developers

2010-10-28 Thread Ken Stanley
On Thu, Oct 28, 2010 at 2:57 PM, Michael McCandless <
luc...@mikemccandless.com> wrote:

> I don't think we should do this until it becomes a "real" problem.
>
> The number of job offers is tiny compared to dev emails, so far, as
> far as I can tell.
>
> Mike
>
>
By the time that it becomes a real problem, it would be too late to get
people to stop spamming the -user mailing list; no?

- Ken


RE: Looking for Developers

2010-10-28 Thread Sharp, Jonathan

http://www.rhyolite.com/anti-spam/you-might-be.html#spammers-are-stupid-3





Consulting in Solr tuning, stop words, dictionary, etc

2010-10-28 Thread Dennis Gearon
Speaking of jobs on this list . . . .

How much does a good Solr consultant cost? 

I am interested first in English, but then in other languages around the world. 
Just need budgetary amounts for a business plan.

1-6mos, or till BIG DOLLARS, whichever comes first ;-)


Dennis Gearon

Signature Warning

It is always a good idea to learn from your own mistakes. It is usually a 
better idea to learn from others’ mistakes, so you do not have to make them 
yourself. from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'

EARTH has a Right To Life,
  otherwise we all die.


Re: Looking for Developers

2010-10-28 Thread Dennis Gearon
Hey! I represent those remarks! I was on that committee (really) because I 
am/was a:

   http://www.rhyolite.com/anti-spam/you-might-be.html#spam-fighter

  and about 20 other 'types' on that list. I'm a little bit more mature, but 
only a little. White lists are the only way to go.

  
Dennis Gearon

Signature Warning

It is always a good idea to learn from your own mistakes. It is usually a 
better idea to learn from others’ mistakes, so you do not have to make them 
yourself. from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'

EARTH has a Right To Life,
  otherwise we all die.


--- On Thu, 10/28/10, Ken Stanley  wrote:

> From: Ken Stanley 
> Subject: Re: Looking for Developers
> To: solr-user@lucene.apache.org
> Date: Thursday, October 28, 2010, 12:33 PM
> On Thu, Oct 28, 2010 at 2:57 PM,
> Michael McCandless <
> luc...@mikemccandless.com>
> wrote:
> 
> > I don't think we should do this until it becomes a
> "real" problem.
> >
> > The number of job offers is tiny compared to dev
> emails, so far, as
> > far as I can tell.
> >
> > Mike
> >
> >
> By the time that it becomes a real problem, it would be too
> late to get
> people to stop spamming the -user mailing list; no?
> 
> - Ken
>


Upgrading from Solr 1.2 to 1.4.1

2010-10-28 Thread johnmunir

I'm using Solr 1.2.  If I upgrade to 1.4.1, must I re-index because of 
LUCENE-1142?  If so, how will this affect me if I don’t re-index (I'm using 
EnglishPorterFilterFactory)?  What about when I’m using non-English stemmers 
from Snowball?
 
Beside the brief note "IMPORTANT UPGRADE NOTE" about this in CHANGES.txt, where 
can I read more about this?  I looked in JIRA, LUCENE-1142, there isn't much.
 
-M


Re: Use SolrCloud (SOLR-1873) on trunk, or with 1.4.1?

2010-10-28 Thread Jan Høydahl / Cominvent
Hi,

I would aim for reindexing on branch3_x, which will be the 3.1 release soon. I 
don't know if SOLR-1873 applies cleanly to 3_x now, but it would surely be less 
effort to have it apply to 3_x than to 1.4. Perhaps you can help backport the 
patch to 3_x?

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 28. okt. 2010, at 03.04, Jeremy Hinegardner wrote:

> Hi all,
> 
> I see that as of r1022188 Solr Cloud has been committed to trunk.
> 
> I was wondering about the stability of Solr Cloud on trunk.  We are
> planning to do a major reindexing soon (within 30 days), several billion docs,
> and would like to switch to a Solr Cloud based infrastructure. 
> 
> We are wondering should use trunk as it is now that SOLR-1873 is applied, or
> should we take SOLR-1873 and apply it to Solr 1.4.1.
> 
> Has anyone used 1.4.1 + SOLR-1873?  In production?
> 
> Thanks,
> 
> -jeremy
> 
> -- 
> 
> Jeremy Hinegardner  jer...@hinegardner.org 
> 



Re: Upgrading from Solr 1.2 to 1.4.1

2010-10-28 Thread Robert Muir
On Thu, Oct 28, 2010 at 4:44 PM,   wrote:
>
> I'm using Solr 1.2.  If I upgrade to 1.4.1, must I re-index because of 
> LUCENE-1142?  If so, how will this affect me if I don’t re-index (I'm using 
> EnglishPorterFilterFactory)?  What about when I’m using non-English stemmers 
> from Snowball?
>
> Beside the brief note "IMPORTANT UPGRADE NOTE" about this in CHANGES.txt, 
> where can I read more about this?  I looked in JIRA, LUCENE-1142, there isn't 
> much.

I haven't looked in detail regarding these changes, but the snowball
was upgraded to revision 500 here.
you can see the revisions/logs of the various algorithms here:
http://svn.tartarus.org/snowball/trunk/snowball/algorithms/?pathrev=500

One problem being, i don't know the previous revision you were
using...but since it had no Hungarian before LUCENE-1142, it couldn't
have possibly been any *later* than revision 385:

Revision 385 - Directory Listing
Added Mon Sep 4 14:06:56 2006 UTC (4 years, 1 month ago) by martin
New Hungarian stemmer

This means, for example, that you would certainly be affected by
changes in the english stemmer such as revision 414, among others:

Revision 414 - Directory Listing
Modified Mon Nov 20 10:49:29 2006 UTC (3 years, 11 months ago) by martin
'arsen' as exceptional p1 position, to prevent 'arsenic' and
'arsenal' conflating

In my opinion, it would be best to re-index.


Reverse range search

2010-10-28 Thread kenf_nc

Doing a range search is straightforward. I have a fixed value in a document
field, I search on [x TO y] and if the fixed value is in the range requested
it gets a hit. But, what if I have data in a document where there is a min
value and a max value and my query is a fixed value and I want to get a hit
if the query value is in that range. For example:

Solr Doc1:
field  min_price:100
field  max_price:500

Solr Doc2:
field  min_price:300
field  max_price:500

and my query is price:250. I could create a query of (min_price:[* TO 250]
AND max_price:[250 TO *]) and that should work. It should find only doc 1.
However, if I have several fields like this and complex queries that include
most of those fields, it becomes a very ugly query. Ideally I'd like to do
something similar to what the spatial contrib guys do where they make
lat/long a single point. If I had a min/max field, I could call it Price
(100, 500) or Price (300,500) and just do a query of  Price:250 and Solr
would see if 250 was in the appropriate range.
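The combined-range workaround above is mostly boilerplate, so until something point-like exists for min/max pairs, a tiny client-side helper can generate each paired clause; this is only a sketch, not an existing Solr API:

```java
public class RangePairQuery {
    // Builds "(min_f:[* TO v] AND max_f:[v TO *])" for a logical field f,
    // i.e. matches docs whose [min, max] interval contains the value v.
    static String containsClause(String field, String value) {
        return "(min_" + field + ":[* TO " + value + "]"
             + " AND max_" + field + ":[" + value + " TO *])";
    }

    public static void main(String[] args) {
        // the price=250 case from the example above
        System.out.println(containsClause("price", "250"));
        // prints (min_price:[* TO 250] AND max_price:[250 TO *])
    }
}
```

The output for price/250 matches the hand-written query above, and ANDing one such clause per field keeps the query generation, if not the query itself, readable.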

Looong question short...Is there something out there already that does this?
Does anyone else do something like this and have some suggestions?
Thanks,
Ken
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Reverse-range-search-tp1789135p1789135.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: spellcheck component does not work with request handler

2010-10-28 Thread abhayd

hi thanks.. It worked.!!
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/spellcheck-component-does-not-work-with-request-handler-tp1786079p1789163.html
Sent from the Solr - User mailing list archive at Nabble.com.


spellchecker results not as desired

2010-10-28 Thread abhayd

hi 

I added spellchecker to request handler. Spellchecker is indexed based.
Terms in index are like
iphone
iphone 4
iphone case
phone
gophone

when i set q=iphole i get suggestions like
iphone
phone
gophone
ipad

Not sure how i would get iphone, iphone 4, iphone case, phone. Any thoughts?

At the same time when i type ipj
i get the result ipad; why not iphone, iphone 4, ipad?
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/spellchecker-results-not-as-desired-tp1789192p1789192.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Searching with wrong keyboard layout or using translit

2010-10-28 Thread Alexander Kanarsky
Pavel,

it depends on the size of your document corpus, the complexity and types of
the queries you plan to use, etc. I would recommend searching for
the discussions on synonyms expansion in Lucene (index time vs. query
time tradeoffs etc.) since your problem is quite similar to that
(think Moskva vs. Moskwa). Unless you have a small corpus, I would go
with the second approach and expand the terms during the query time.
However, the first approach might be useful, too: say, you may want to
boost the score for the documents that naturally contain the word
'Moskva', so such a documents will be at the top of the result list.
Having both forms indexed will allow you to achieve this easily by
utilizing Solr's dismax query (to boost the results from the field
with the original terms):
http://localhost:8983/solr/select/?q=Moskva&defType=dismax&qf=text^10.0+text_translit^0.1
('text' field has the original Cyrillic tokens, 'text_translit' is for
transliterated ones)
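As a sketch of the two-field setup used in that dismax example (the transliterating analyzer itself would have to be a custom or synonym-style component; all names here are assumptions):

```xml
<!-- original Cyrillic tokens -->
<field name="text" type="text_ru" indexed="true" stored="true"/>
<!-- same content indexed through a transliterating analyzer -->
<field name="text_translit" type="text_ru_translit" indexed="true"
       stored="false"/>
<copyField source="text" dest="text_translit"/>
```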

-Alexander


2010/10/28 Pavel Minchenkov :
> Alexander,
>
> Thanks,
> What variant has better performance?
>
>
> 2010/10/28 Alexander Kanarsky 
>
>> Pavel,
>>
>> I think there is no single way to implement this. Some ideas that
>> might be helpful:
>>
>> 1. Consider adding additional terms while indexing. This assumes
>> conversion of Russian text to both "translit" and "wrong keyboard"
>> forms and index converted terms along with original terms (i.e. your
>> Analyzer/Filter should produce Moskva and Vjcrdf for term Москва). You
>> may re-use the same field (if you plan for a simple term queries) or
>> create a separate fields for the generated terms (better for phrase,
>> proximity queries etc. since it keeps the original text positional
>> info). Then the query could use any of these forms to fetch the
>> document. If you use separate fields, you'll need to expand/create
>> your query to search for them, of course.
>> 2. If you have to index just an original Russian text, you might
>> generate all term forms while analyzing the query, then you could
>> treat the converted terms as a synonyms and use the combination of
>> TermQuery for all term forms or the MultiPhraseQuery for the phrases.
>> For Solr in this case you probably will need to add a custom filter
>> similar to SynonymFilter.
>>
>> Hope this helps,
>> -Alexander
>>
>> On Wed, Oct 27, 2010 at 1:31 PM, Pavel Minchenkov 
>> wrote:
>> > Hi,
>> >
>> > When I'm trying to search Google with wrong keyboard layout -- it
>> corrects
>> > my query, example: http://www.google.ru/search?q=vjcrdf (I typed word
>> > "Moscow" in Russian but in English keyboard layout).
>> > Also, when I'm searching using
>> > translit, It does the same: http://www.google.ru/search?q=moskva
>> >
>> > What is the right way to implement this feature in Solr?
>> >
>> > --
>> > Pavel Minchenkov
>> >
>>
>
>
>
> --
> Pavel Minchenkov
>


Re: Keeping "qt" parameter in distributed search

2010-10-28 Thread Shawn Heisey

On 10/28/2010 12:02 PM, Chris Hostetter wrote:

I'm not very knowledgeable about how distributed searching deals with
request handlers, url paths, and the qt param (i have no idea why the
exact same handler isn't propograted to the remote shards by default -- i
thought it was, but your email suggests that it isn't) but i have seen a
"shards.qt" param mentioned on the list and in some tests.

It doesn't appear to be documented on the wiki anywhere, but you may want
to search for it and see if it helps your situation.


Thank you, Hoss, that was the secret ingredient!  It looks much better now.
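For reference, a sketch of where shards.qt sits in a distributed request. The host names and handler name here are made-up placeholders; only the parameter names (qt, shards, shards.qt) come from the thread:

```python
from urllib.parse import urlencode

# Hypothetical hosts and handler name; shards.qt tells the aggregating
# node which request handler to use when it queries each shard.
params = {
    'q': 'ipod',
    'qt': 'myhandler',                            # handler on the aggregating core
    'shards': 'host1:8983/solr,host2:8983/solr',  # shard list
    'shards.qt': 'myhandler',                     # handler for per-shard requests
}
url = 'http://host1:8983/solr/select?' + urlencode(params)
print(url)
```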



Re: documentCache clarification

2010-10-28 Thread Chris Hostetter

: the documentCache: "(Note: This cache cannot be used as a source for
: autowarming because document IDs will change when anything in the
: index changes so they can't be used by a new searcher.)"
: 
: Can anyone elaborate a bit on that. I think I've read it at least 10
: times and I'm still unable to draw a mental picture. I'm wondering if
: the document IDs referred to are the ones I'm defining in my schema,
: or are they the underlying lucene ids, i.e. the ones that, according
: to the Lucene in Action book, are "relative within each segment"?

they are the underlying lucene docIds that change as segments are merged.

: queryResultCache. However, if I issue a request with rows=10, I will
: get an insert, and then a later request for rows=500 would re-use and
: update that original cached docList object. Right? And would it be
: updated with the full list of 500 ordered doc ids or only 200?

not quite.

The queryResultCache is keyed on  and the 
value is a "DocList" object ...

http://lucene.apache.org/solr/api/org/apache/solr/search/DocList.html

Unlike the Document objects in the documentCache, the DocLists in the 
queryResultCache never get modified (technically Solr doesn't actually 
modify the Documents either; the Document just keeps track of its fields 
and updates itself as lazy-load fields are needed)

if a DocList containing results 0-10 is put in the cache, it's not 
going to be of any use for a query with start=50.  but if it contains 0-50 
it *can* be used if start < 50 and rows < 50 -- that's where the 
queryResultWindowSize comes in.  if you use start=0&rows=10, but your 
window size is 50, SolrIndexSearcher will (under the covers) use 
start=0&rows=50 and put that in the cache, returning a "slice" from 0-10 
for your query.  the next query asking for 10-20 will be a cache hit.


-Hoss


Re: Searching for terms on specific fields

2010-10-28 Thread Chris Hostetter

The specifics of your overall goal confuse me a bit, but drilling down to 
your core question...

: I want to be able to use the dismax parser to search on both terms
: (assigning slops and tie breaks). I take it the 'fq' is a candidate for
: this,but can I add dismax capabilities to fq as well? Also my query would be

...you can use any parser you want for fq, using the localparams syntax...

   http://wiki.apache.org/solr/LocalParams

..so you could have something like...

   q=foo:bar&fq={!dismax qf='yak zak'}baz

..the one thing you have to watch out for when using localparams and 
dismax is that the outer params are inherited by the inner params by 
default -- so if you are using dismax for your main query 'q' (with 
defType) and you have global params for qf, pf, bq, etc... those are 
inherited by your fq={!dismax} query unless you override them with local 
params
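Since the local-params syntax uses characters that are not URL-safe, the fq value needs encoding when sent over HTTP. A quick sketch (the field names yak/zak just follow the example above):

```python
from urllib.parse import urlencode

# Encode a filter query that selects the dismax parser via local params.
params = {
    'q': 'foo:bar',
    'fq': "{!dismax qf='yak zak'}baz",  # local params choose the parser
}
query_string = urlencode(params)
print(query_string)
```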


-Hoss


Ensuring stable timestamp ordering

2010-10-28 Thread Michael Sokolov
I'm curious what if any guarantees there are regarding the "timestamp" field
that's defined in the sample solr schema.xml.  Just for completeness, the
definition is:



RE: documentCache clarification

2010-10-28 Thread Jonathan Rochkind
This is a great explanation, thanks.  I'm going to add it to the wiki somewhere 
that seems relevant, if no-one minds and the wiki lets me. 

From: Chris Hostetter [hossman_luc...@fucit.org]
Sent: Thursday, October 28, 2010 7:27 PM
To: solr-user@lucene.apache.org
Subject: Re: documentCache clarification

: the documentCache: "(Note: This cache cannot be used as a source for
: autowarming because document IDs will change when anything in the
: index changes so they can't be used by a new searcher.)"
:
: Can anyone elaborate a bit on that. I think I've read it at least 10
: times and I'm still unable to draw a mental picture. I'm wondering if
: the document IDs referred to are the ones I'm defining in my schema,
: or are they the underlying lucene ids, i.e. the ones that, according
: to the Lucene in Action book, are "relative within each segment"?

they are the underlying lucene docIds that change as segments are merged.

: queryResultCache. However, if I issue a request with rows=10, I will
: get an insert, and then a later request for rows=500 would re-use and
: update that original cached docList object. Right? And would it be
: updated with the full list of 500 ordered doc ids or only 200?

not quite.

The queryResultCache is keyed on  and the
value is a "DocList" object ...

http://lucene.apache.org/solr/api/org/apache/solr/search/DocList.html

Unlike the Document objects in the documentCache, the DocLists in the
queryResultCache never get modified (technically Solr doesn't actually
modify the Documents either; the Document just keeps track of its fields
and updates itself as lazy-load fields are needed)

if a DocList containing results 0-10 is put in the cache, it's not
going to be of any use for a query with start=50.  but if it contains 0-50
it *can* be used if start < 50 and rows < 50 -- that's where the
queryResultWindowSize comes in.  if you use start=0&rows=10, but your
window size is 50, SolrIndexSearcher will (under the covers) use
start=0&rows=50 and put that in the cache, returning a "slice" from 0-10
for your query.  the next query asking for 10-20 will be a cache hit.


-Hoss


RE: Ensuring stable timestamp ordering

2010-10-28 Thread Michael Sokolov
(Sorry - fumble finger sent too soon.)


My confusion stems from the fact that in my test I insert a number of
documents, and then retrieve them ordered by timestamp, and they don't come
back in the same order they were inserted (the order seems random), unless I
commit after each insert. 

Is that expected?  I could create my own timestamp values easily enough, but
would just as soon not do so if I could use a pre-existing feature that
seems tailor-made.
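A minimal sketch of that client-side fallback: assigning your own strictly increasing timestamps before indexing, rather than relying on the server-side NOW default. The field name and format here are assumptions:

```python
from datetime import datetime, timedelta, timezone

# Client-assigned timestamps: strictly increasing even when several
# documents are built within the same clock tick.
_last = None

def next_stamp():
    """Return a datetime strictly greater than any previously returned."""
    global _last
    t = datetime.now(timezone.utc)
    if _last is not None and t <= _last:
        t = _last + timedelta(microseconds=1)  # break ties within one tick
    _last = t
    return t

docs = [{'id': i, 'timestamp': next_stamp().isoformat()} for i in range(5)]
```

Sorting on such a field then reproduces insertion order regardless of commit frequency.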

-Mike

> -Original Message-
> From: Michael Sokolov [mailto:soko...@ifactory.com] 
> Sent: Thursday, October 28, 2010 9:55 PM
> To: 'solr-user@lucene.apache.org'
> Subject: Ensuring stable timestamp ordering
> 
> I'm curious what if any guarantees there are regarding the 
> "timestamp" field that's defined in the sample solr 
> schema.xml.  Just for completeness, the definition is:
> 

   
   



Re: How to use polish stemmer - Stempel - in schema.xml?

2010-10-28 Thread Bernd Fehling
Hi Jakub,

I have ported the KStemmer for use in the most recent Solr trunk version.
My stemmer is located in the lib directory of Solr "solr/lib/KStemmer-2.00.jar"
because it belongs to Solr.

Write it as a FilterFactory and use it as a filter like:


This is what my fieldType looks like:
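(The XML in this message was stripped by the list archive. A minimal sketch of what wiring a custom stemmer FilterFactory into a schema.xml fieldType typically looks like; the class names and analyzer chain here are assumptions, not the actual configuration:)

```xml
<!-- Hypothetical sketch: tokenizer/filter classes are assumptions -->
<fieldType name="text_stemmed" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- custom stemmer jar dropped into solr/lib is picked up at startup -->
    <filter class="org.example.KStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="org.example.KStemFilterFactory"/>
  </analyzer>
</fieldType>
```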


  






  
  






  


Regards,
Bernd



Am 28.10.2010 14:56, schrieb Jakub Godawa:
> Hi!
> There is a Polish stemmer http://www.getopt.org/stempel/ and I have
> problems connecting it with Solr 1.4.1.
> Questions:
> 
> 1. Where EXACTLY do I put "stemper-1.0.jar" file?
> 2. How do I register the file, so I can build a fieldType like:
> 
> 
>   
> 
> 
> 3. Is that the right approach to make it work?
> 
> Thanks for verbose explanation,
> Jakub.


Exception while processing: attach document

2010-10-28 Thread Bac Hoang

 Hello all,

I'm getting stuck trying to import an Oracle DB into a Solr index; could 
any one of you give a hand? Thanks a million.


Below is some short info that might be relevant:

My Solr: 1.4.1

 *LOG *
INFO: Starting Full Import
Oct 29, 2010 1:19:35 PM org.apache.solr.handler.dataimport.SolrWriter 
readIndexerProperties

INFO: Read dataimport.properties
Oct 29, 2010 1:19:35 PM 
org.apache.solr.handler.dataimport.JdbcDataSource$1 call
INFO: Creating a connection for entity attach with URL: 
jdbc:oracle:thin:@192.168.72.7:1521:OFIRDS22
Oct 29, 2010 1:19:36 PM org.apache.solr.handler.dataimport.DocBuilder 
buildDocument
*SEVERE: Exception while processing: attach document *: 
SolrInputDocument[{}]
org.apache.solr.handler.dataimport.DataImportHandlerException: *Unable 
to execute query: *select * from /A.B/ Processing Document # 1


where A: a schema
B: a table

 *dataSource *===
 url="jdbc:oracle:thin:@192.168.72.7:1521:OFIRDS22" user="abc" 
password="xyz"

 readOnly="true" autoCommit="false" batchSize="1"/>


format="text">






where TOPIC is a field of table B
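(The data-config XML above was mangled by the archive. A minimal sketch of a JDBC DataImportHandler configuration of this shape — the connection details are taken from the fragments above; the driver class, entity name, and field mapping are assumptions:)

```xml
<dataConfig>
  <!-- connection details as quoted above; driver and field names assumed -->
  <dataSource type="JdbcDataSource"
              driver="oracle.jdbc.OracleDriver"
              url="jdbc:oracle:thin:@192.168.72.7:1521:OFIRDS22"
              user="abc" password="xyz"
              readOnly="true" autoCommit="false" batchSize="1"/>
  <document>
    <entity name="attach" query="select * from A.B">
      <field column="TOPIC" name="topic"/>
    </entity>
  </document>
</dataConfig>
```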

Thanks again



Re: Looking for Developers

2010-10-28 Thread 朱炎詹
When I first saw this particular email, I wrote a reply intending to ask the 
sender to remove solr-user from its recipients, because I thought this should 
go to solr-dev. But then I thought again: it's about a 'job offer', not 
'development of Solr', so I just deleted my email.


Maybe solr-job is a good suggestion. A selfish reason for this suggestion is 
that I'm also looking for someone familiar with Solr to work for me in 
Taiwan, and I really don't know where to ask.


Scott

- Original Message - 
From: "Dennis Gearon" 

To: ; 
Sent: Friday, October 29, 2010 4:28 AM
Subject: Re: Looking for Developers


Hey! I represent those remarks! I was on that committee (really) because I 
am/was a:


  http://www.rhyolite.com/anti-spam/you-might-be.html#spam-fighter

 and about 20 other 'types' on that list. I'm a little bit more mature, but 
only a little. White lists are the only way to go.



Dennis Gearon

Signature Warning

It is always a good idea to learn from your own mistakes. It is usually a 
better idea to learn from others’ mistakes, so you do not have to make them 
yourself. from 
'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'


EARTH has a Right To Life,
 otherwise we all die.


--- On Thu, 10/28/10, Ken Stanley  wrote:


From: Ken Stanley 
Subject: Re: Looking for Developers
To: solr-user@lucene.apache.org
Date: Thursday, October 28, 2010, 12:33 PM
On Thu, Oct 28, 2010 at 2:57 PM,
Michael McCandless <
luc...@mikemccandless.com>
wrote:

> I don't think we should do this until it becomes a
"real" problem.
>
> The number of job offers is tiny compared to dev
emails, so far, as
> far as I can tell.
>
> Mike
>
>
By the time that it becomes a real problem, it would be too
late to get
people to stop spamming the -user mailing list; no?

- Ken








