Re: Search with punctuations

2013-07-15 Thread kobe.free.wo...@gmail.com
Hi Erick,

Thanks for your reply!

I have tried both of the suggestions that you have mentioned i.e.,

1. Using WhitespaceTokenizerFactory
2. Using WordDelimiterFilterFactory with
catenateWords=1

But I still face the same issue. Must the tokenizers/filter factories used
be the same for both the query and index analyzers?

In my scenario, when I search for INTL, I want Solr to return both the
records containing strings like INTL and those containing INT'L.

Please suggest other alternatives to achieve this.

Thanks!





Re: Custom processing in Solr Request Handler plugin and its debugging ?

2013-07-15 Thread Tony Mullins
Ok Thanks Erick, for your help.

Tony.


On Sun, Jul 14, 2013 at 5:12 PM, Erick Erickson erickerick...@gmail.com wrote:

 Not sure how to do the pass to another request handler thing, but
 the debugging part is pretty straightforward. I use IntelliJ, but as far
 as I know Eclipse has very similar capabilities.

 First, I cheat and path to the jar that's the output from my IDE; that
 saves copying the jar around. So my solrconfig.xml file has a <lib>
 directive like
 <lib dir="../../../../../eoe/project/out/artifact/jardir"/>
 where this is wherever your IDE wants to put it. It can sometimes be
 tricky to get enough ../../../ in there.

 Second, "Edit Configurations", select "Remote" and a form comes up. Fill
 in host and port, something like localhost and 5900 (this latter
 is whatever you want). In IntelliJ that'll give you the specific command
 to use to start Solr so you can attach. This looks like the following
 for my setup:
 java -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=5900
 -jar start.jar

 Now just fire up Solr as above. Fire up your remote debugging
 session in IntelliJ. Set breakpoints as you wish. NOTE: the suspend=y
 bit above means that Solr will do _nothing_ until you attach the
 debugger and hit "go".

 HTH
 Erick
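
 For the pass-to-another-handler part, a minimal sketch (not from the
 original mail; Solr 4.x APIs, untested, and the "delegated" key is
 illustrative) of one way a custom handler can invoke /select in-process
 and post-process its output:

 import org.apache.solr.common.params.ModifiableSolrParams;
 import org.apache.solr.core.SolrCore;
 import org.apache.solr.request.LocalSolrQueryRequest;
 import org.apache.solr.request.SolrQueryRequest;
 import org.apache.solr.request.SolrRequestHandler;
 import org.apache.solr.response.SolrQueryResponse;

 public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp)
         throws Exception {
     SolrCore core = req.getCore();
     SolrRequestHandler select = core.getRequestHandler("/select");

     // copy the incoming params and adjust them before delegating
     ModifiableSolrParams params = new ModifiableSolrParams(req.getParams());

     SolrQueryRequest subReq = new LocalSolrQueryRequest(core, params);
     try {
         SolrQueryResponse subRsp = new SolrQueryResponse();
         core.execute(select, subReq, subRsp);
         // post-process subRsp.getValues() here, then expose the result
         rsp.add("delegated", subRsp.getValues());
     } finally {
         subReq.close();
     }
 }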

 On Sat, Jul 13, 2013 at 6:57 AM, Tony Mullins tonymullins...@gmail.com
 wrote:
  Please, any help on how to pass the search request to a different
  RequestHandler from within the custom RequestHandler, and how to debug the
  custom RequestHandler plugin?
 
  Thanks,
  Tony
 
 
  On Fri, Jul 12, 2013 at 4:41 PM, Tony Mullins tonymullins...@gmail.com
 wrote:
 
  Hi,
 
  I have defined my new Solr RequestHandler plugin like this in
  SolrConfig.xml
 
  <requestHandler name="/myendpoint" class="com.abc.MyRequestPlugin">
  </requestHandler>
 
  And its working fine.
 
  Now I want to do some custom processing from this plugin by making a
  search query to the regular '/select' handler:
   <requestHandler name="/select" class="solr.SearchHandler">
   ...
   </requestHandler>
 
  And then receive the results back from the '/select' handler, perform
  some custom processing on those results, and send the response back via my
  custom /myendpoint handler.
 
  And for this I need help on how to make a call to the '/select' handler from
  within the MyRequestPlugin class and perform some calculation on the
  results.
 
  I also need some help on how to debug my plugin. As its .jar has been
  deployed to solr_home/lib ... how can I attach my plugin's code in eclipse
  to the Solr process so I could debug it when a user sends a request to my
  plugin?
 
  Thanks,
  Tony
 



Doc's FunctionQuery result field in my custom SearchComponent class ?

2013-07-15 Thread Tony Mullins
Hi,

I have extended Solr's SearchComponent class and I am iterating through all
the docs in the ResponseBuilder in my @Override process() method.

Here I want to get the value of the FunctionQuery result, but in the Document
object I am only seeing the standard fields of the document, not the
FunctionQuery result.

This is my query

http://localhost:8080/solr/collection2/demoendpoint?q=spider&wt=xml&indent=true&fl=*,freq:termfreq%28product,%27spider%27%29

The result of the above query in the browser shows me that 'freq' is part of
the doc, but it's not there in the Document object in my @Override process() method.

How can I get the value of FunctionQuery result in my custom
SearchComponent ?

Thanks,
Tony


facet filtering

2013-07-15 Thread Daniel Rosher
Hi,

How can I have faceting on a subset of the query docset e.g. with something
akin to:

SimpleFacets.base =
SolrIndexSearcher.getDocSet(
Query mainQuery,
SolrIndexSearcher.getDocSet(Query filter)
)

Is there anything like facet.fq?

Cheers,
Dan


Getting numDocs and pendingDocs in Solr4.3

2013-07-15 Thread Federico Ragona
Hi,

I'm trying to write a validation test that reads some statistics by querying
Solr 4.3 via HTTP, namely the number of indexed documents (`numDocs`) and the
number of pending documents (`pendingDocs`) from the Solr4 cluster. I believe
that in Solr3 there was a `stats.jsp` page that offered both numbers.

Is there a way to get both fields in Solr4?

Best regards,
Federico


Re: SolrCloud group.query error shard X did not set sort field values or how i can set fillFields=true on IndexSearcher.search

2013-07-15 Thread Evgeny Salnikov
Thank you!
I really need to eventually increase the number of shards, so I cannot
fix numShards=X up front; the only way out is SPLITSHARD, but then I
encountered the following problem:

1. run empty node1
java -Dbootstrap_confdir=./solr/collection1/conf
-Dcollection.configName=myconf -DzkRun -jar start.jar -DnumShards=1
2. run empty node2
java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar
3. cluster is - collection1 - shard1 - master (node1) and replica (node2)
4. add some data (10 docs)
5. http://node1:8983/solr/collection1/select?q=*:*
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">5</int>
<lst name="params">
<str name="q">*:*</str>
</lst>
</lst>
<result name="response" numFound="10" start="0">
<doc>...</doc>
<doc>...</doc>
</result>
</response>
6. try group.query
http://node1:8983/solr/collection1/select?q=*:*&group=true&group.query=street:%D0%9A%D0%BE%D1%80%D0%BE%D0%BB%D0%B5%D0%B2%D0%B0
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">13</int>
<lst name="params">
<str name="q">*:*</str>
<str name="group.query">street:Королева</str>
<str name="group">true</str>
</lst>
</lst>
<lst name="grouped">
<lst name="street:Королева">
<int name="matches">10</int>
<result name="doclist" numFound="10" start="0">
<doc>
<str name="id">cdb1c990-d00c-4d2c-95ba-4f496e559be3</str>
<str name="street">Королева</str>
<str name="house">7</str>
<int name="number">62</int>
<str name="owner">Сидоров</str>
<str name="note">Дела отлично!</str>
<long name="_version_">1440614179417358336</long>
</doc>
</result>
</lst>
</lst>
</response>
7. try split shard1
http://node1:8983/solr/admin/collections?action=SPLITSHARD&collection=collection1&shard=shard1
<response>
 <lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">9288</int>
 </lst>
 <lst name="success">
  <lst>
   <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">2441</int>
   </lst>
   <str name="core">collection1_shard1_1_replica1</str>
   <str name="saved">/home/evgenysalnikov/solrtest/node1/example/solr/solr.xml</str>
  </lst>
  <lst>
   <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">2479</int>
   </lst>
   <str name="core">collection1_shard1_0_replica1</str>
   <str name="saved">/home/evgenysalnikov/solrtest/node1/example/solr/solr.xml</str>
  </lst>
  <lst>
   <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">5002</int>
   </lst>
  </lst>
  <lst>
   <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">5002</int>
   </lst>
  </lst>
  <lst>
   <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">141</int>
   </lst>
  </lst>
  <lst>
   <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
   </lst>
   <str name="core">collection1_shard1_0_replica1</str>
   <str name="status">EMPTY_BUFFER</str>
  </lst>
  <lst>
   <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1</int>
   </lst>
   <str name="core">collection1_shard1_1_replica1</str>
   <str name="status">EMPTY_BUFFER</str>
  </lst>
  <lst>
   <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">2515</int>
   </lst>
   <str name="core">collection1_shard1_1_replica2</str>
   <str name="saved">/home/evgenysalnikov/solrtest/node2/example/solr/solr.xml</str>
  </lst>
  <lst>
   <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">2554</int>
   </lst>
   <str name="core">collection1_shard1_0_replica2</str>
   <str name="saved">/home/evgenysalnikov/solrtest/node2/example/solr/solr.xml</str>
  </lst>
  <lst>
   <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">4001</int>
   </lst>
  </lst>
  <lst>
   <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">4002</int>
   </lst>
  </lst>
 </lst>
</response>
8. Cluster state changes to
shard1 - master (inactive),
shard1 - slave (inactive)
shard1_0 - master,
shard1_0 - slave,
shard1_1 - master,
shard1_1 - slave
9. Commit http://node1:8983/solr/collection1/update?commit=true
10. Reloading http://node1:8983/solr/collection1/select?q=*:* gives me
different results: numFound 5, 0, 10 (I added 10 docs)
Node2 core info is
collection1 - shard1 - 10 docs
collection1_shard1_0_replica2 - 0 docs
collection1_shard1_1_replica2 - 0 docs
11. I restart node2
   Node2 core info is
   collection1 - shard1 - 10 docs
   collection1_shard1_0_replica2 - 5 docs
   collection1_shard1_1_replica2 - 5 docs
12. http://node1:8983/solr/collection1/select?q=*:* always gives the
correct result - 10 documents

But
http://node1:8983/solr/collection1/select?q=*:*&group=true&group.query=street:%D0%9A%D0%BE%D1%80%D0%BE%D0%BB%D0%B5%D0%B2%D0%B0
returns the familiar error
shard 0 did not set sort field values (FieldDoc.fields is null); you must
pass fillFields=true to IndexSearcher.search on each shard

Did I somehow run SPLITSHARD incorrectly?


Also, I once tried specifying the number of shards as 2:
1. run empty node1
java 

Re: Is it possible to find a leader from a list of cores in solr via java code

2013-07-15 Thread vicky desai
Hi,

I got the solution to the above problem. Sharing the code so that it can
help people in the future:

import org.apache.http.client.HttpClient;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.impl.conn.PoolingClientConnectionManager;
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.client.solrj.impl.LBHttpSolrServer;

PoolingClientConnectionManager poolingClientConnectionManager =
        new PoolingClientConnectionManager();
poolingClientConnectionManager.setMaxTotal(2);
poolingClientConnectionManager.setDefaultMaxPerRoute(1);
HttpClient httpClient = new DefaultHttpClient(poolingClientConnectionManager);
LBHttpSolrServer lbServer = new LBHttpSolrServer(httpClient);
server = new CloudSolrServer(zkhost, lbServer);
server.setDefaultCollection(collectionName);

Thanks a lot to every1 in the thread chain. Your suggestions helped a lot





Solr is not responding on deployment in tomcat

2013-07-15 Thread Per Newgro
Hi,

maybe someone here can help me with my solr-4.3.1 issue.

I've successful deployed the solr.war on a tomcat7 instance.
Starting the tomcat with only the solr.war deployed - works nicely.
I can see the admin interface and logs are clean.

If I deploy my wicket-spring-data-solr based app (using the HttpSolrServer)
after the solr app, without restarting the tomcat,
=> all is fine too.

I've implemented a ping to see if the server is up.

<code>
private void waitUntilSolrIsAvailable(int i) {
        if (i == 0) {
                logger.info("Check solr state...");
        }
        if (i > 5) {
                throw new RuntimeException("Solr is not available after "
                                + "more than 25 secs. Going down now.");
        }
        if (i > 0) {
                try {
                        logger.info("Wait for solr to get alive.");
                        // Thread.currentThread().wait(5000) would throw
                        // IllegalMonitorStateException; sleep is what's wanted
                        Thread.sleep(5000);
                } catch (InterruptedException e) {
                        throw new RuntimeException(e);
                }
        }
        try {
                i++;
                SolrPingResponse r = solrServer.ping();
                if (r.getStatus() > 0) {
                        waitUntilSolrIsAvailable(i);
                }
                logger.info("Solr is alive.");
        } catch (SolrServerException | IOException e) {
                throw new RuntimeException(e);
        }
}
</code>

Here I can see this log:
<log>
54295 [localhost-startStop-2] INFO  org.apache.wicket.Application  – 
[wicket.project] init: Wicket extensions initializer
INFO  - 2013-07-15 12:07:45.261; 
de.company.service.SolrServerInitializationService; Check solr state...
54505 [localhost-startStop-2] INFO  
de.company.service.SolrServerInitializationService  – Check solr state...
INFO  - 2013-07-15 12:07:45.768; org.apache.solr.core.SolrCore; [collection1] 
webapp=/solr path=/admin/ping params={wt=javabin&version=2} hits=0 status=0 
QTime=20
55012 [http-bio-8080-exec-1] INFO  org.apache.solr.core.SolrCore  – 
[collection1] webapp=/solr path=/admin/ping params={wt=javabin&version=2} 
hits=0 status=0 QTime=20
INFO  - 2013-07-15 12:07:45.770; org.apache.solr.core.SolrCore; [collection1] 
webapp=/solr path=/admin/ping params={wt=javabin&version=2} status=0 QTime=22
55014 [http-bio-8080-exec-1] INFO  org.apache.solr.core.SolrCore  – 
[collection1] webapp=/solr path=/admin/ping params={wt=javabin&version=2} 
status=0 QTime=22
INFO  - 2013-07-15 12:07:45.854; 
de.company.service.SolrServerInitializationService; Solr is alive.
55098 [localhost-startStop-2] INFO  
de.company.service.SolrServerInitializationService  – Solr is alive.
</log>

But if I restart the tomcat with both webapps (solr and wicket),
solr is not responding to the ping request.

<log>
INFO  - 2013-07-15 12:02:27.634; org.apache.wicket.Application; 
[wicket.project] init: Wicket extensions initializer
11932 [localhost-startStop-1] INFO  org.apache.wicket.Application  – 
[wicket.project] init: Wicket extensions initializer
INFO  - 2013-07-15 12:02:27.787; 
de.company.service.SolrServerInitializationService; Check solr state...
12085 [localhost-startStop-1] INFO  
de.company.service.SolrServerInitializationService  – Check solr state...
</log>

What could that be, or how can I get info on where this is stopping?

Thanks for your support
Per


Solr Zookeeper - Too Many file descriptors on network failure

2013-07-15 Thread Ranjith Venkatesan
Hi,

I am having an issue with a network failure to one of the nodes (or many).
When the network is down, the number of sockets on that machine keeps
increasing; at a point it throws a "too many file descriptors" exception.

When the network comes back before that exception, all the open sockets get
closed, and hence the node is able to join the cloud.

But when the network comes back after that exception, the node cannot
join the cloud.


Thanks in advance


RANJITH VENKATESAN





Solr-Max connections

2013-07-15 Thread Ranjith Venkatesan
Hi,

I am using solr-4.3.0 with zookeeper-3.4.5. My scenario is that users will
communicate with solr via the zookeeper ports.

*My question is: how many users can simultaneously access solr?*

In zookeeper I configured maxClientCnxns, but that is for max connections
from a single host (user?).

Note: My assumption is that max connections will be governed by Jetty or
Tomcat. Is it so? If not, how do I configure max connections in solr and
zookeeper?

In my case 1000 users may search simultaneously, and indexing will also
happen at the same time.

Thanks in advance

Ranjith Venkatesan





Re: Doc's FunctionQuery result field in my custom SearchComponent class ?

2013-07-15 Thread Tony Mullins
Please, any help on how to get the value of the 'freq' field in my custom
SearchComponent?

http://localhost:8080/solr/collection2/demoendpoint?q=spider&wt=xml&indent=true&fl=*,freq:termfreq%28product,%27spider%27%29

<doc><str name="id">11</str><str name="type">Video Games</str><str
name="format">xbox 360</str><str name="product">The Amazing
Spider-Man</str><int name="popularity">11</int><long
name="_version_">1439994081345273856</long><int name="freq">1</int></doc>



Here is my code

DocList docs = rb.getResults().docList;
DocIterator iterator = docs.iterator();
int sumFreq = 0;
String id = null;

for (int i = 0; i < docs.size(); i++) {
    try {
        int docId = iterator.nextDoc();

        // Document doc = searcher.doc(docId, fieldSet);
        Document doc = searcher.doc(docId);

In the doc object I can see the schema fields like 'id', 'type', 'format' etc.,
but I cannot find the field 'freq' which I need. Is there any way to get
the FunctionQuery fields in the doc object?
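
(A sketch, not from the original mail: as far as I can tell, the fl
pseudo-field "freq" is computed by a DocTransformer only while the response
is written, so it never appears on the stored Document. One way to get the
same number inside a SearchComponent is to read the postings directly;
Lucene 4.x APIs, with the field/term taken from the query above:)

import java.util.List;
import org.apache.lucene.index.AtomicReaderContext;
import org.apache.lucene.index.DocsEnum;
import org.apache.lucene.index.ReaderUtil;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.util.BytesRef;

// resolve the global docId (from the DocIterator) to its leaf segment
List<AtomicReaderContext> leaves = searcher.getTopReaderContext().leaves();
AtomicReaderContext leaf = leaves.get(ReaderUtil.subIndex(docId, leaves));
int localDoc = docId - leaf.docBase;

int freq = 0;
Terms terms = leaf.reader().terms("product");
if (terms != null) {
    TermsEnum te = terms.iterator(null);
    if (te.seekExact(new BytesRef("spider"), false)) {
        DocsEnum de = te.docs(leaf.reader().getLiveDocs(), null, DocsEnum.FLAG_FREQS);
        if (de != null && de.advance(localDoc) == localDoc) {
            freq = de.freq();  // termfreq(product,'spider') for this doc
        }
    }
}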

Thanks,
Tony


On Mon, Jul 15, 2013 at 1:16 PM, Tony Mullins tonymullins...@gmail.com wrote:

 Hi,

 I have extended Solr's SearchComponent class and I am iterating through all
 the docs in the ResponseBuilder in my @Override process() method.

 Here I want to get the value of the FunctionQuery result, but in the Document
 object I am only seeing the standard fields of the document, not the
 FunctionQuery result.

 This is my query


 http://localhost:8080/solr/collection2/demoendpoint?q=spider&wt=xml&indent=true&fl=*,freq:termfreq%28product,%27spider%27%29

 The result of the above query in the browser shows me that 'freq' is part of
 the doc, but it's not there in the Document object in my @Override process() method.

 How can I get the value of FunctionQuery result in my custom
 SearchComponent ?

 Thanks,
 Tony



Nested query in SOLR filter query (fq)

2013-07-15 Thread EquilibriumCST
Hi all,

I have the following case.

Solr documents have the fields id and status. Id is not unique; unique is the
combination of these two fields.
Documents with the same id have different statuses.

List of documents:

 -ID-  -STATUS-
  id1     1
  id1     2
  id1     3
  id1     4
  id2     1
  id2     2
  id3     1
  
I need to make a query that takes all documents with a specific status and
excludes documents that also have another specific status.
As an example, I need to get all documents that have status 2 and don't have
status 3.
The expected result should be the document:
  id2     2

Another example: all documents with status 1 that don't have status 3. Then
the result should be:
  id2     1
  id3     1
  
Here is my query that doesn't work:
http://192.168.130.14:13080/solr/select/?q=status:1&version=2.2&start=0&rows=10&indent=on&fl=id,status&fq=-id:(*:*%20AND%20status:2)
The problem is in the filter query (fq) part. The fq must contain the ids of
the documents with status 2, and if the current document's id is in that list
it must be excluded.
I guess some subquery must be used in the fq part, or something else.
Just for information, we are using Apache Solr 3.6 and the document count is
around 100k.

Thanks in advance!





Re: Solr caching clarifications

2013-07-15 Thread Erick Erickson
Manuel:

First off, anything that Mike McCandless says about low-level
details should override anything I say. The memory savings
he's talking about there are actually something he tutored me
in once on a chat.

The savings there, as I understand it, aren't huge. For large
sets I think it's a 25% savings (if I calculated right). But consider
that even without those savings, 8 filter cache entries will be
more than the entire structure that JIRA talks about

As to your fq question, absolutely! Any yes/no clause that,
as you say, doesn't contribute to the score is a candidate to be
moved to an fq clause. There are a couple of things to
be aware of though.
1> Be a little careful of using NOW. If you don't use it correctly,
 fq clauses will not be re-used (example below). See:
 http://searchhub.org/2012/02/23/date-math-now-and-filter-queries/
2> How you usually do this is through the UI, not the users entering
 a query. For instance, if you have a date-range picker your app
 constructs the fq clause from that. Or you append fq clauses to the
 links you create when you display facets or...
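
(An illustration, not from the original mail - a date filter rounded with
date math so the fq string is identical across requests and the filterCache
entry gets re-used; the field name is made up:)

fq=last_modified:[NOW/DAY-7DAYS TO NOW/DAY+1DAY]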

No, there's no automatic tool for this. There's not likely to be one,
since there's no way to infer the intent. Say you put in a clause like
q=a AND b.
That scores things. It would give the same result set as
q=*:*&fq=a&fq=b
which would compute no scores. How could a tool infer when this
was or wasn't OK?

Best
Erick

On Sun, Jul 14, 2013 at 6:10 PM, Manuel Le Normand
manuel.lenorm...@gmail.com wrote:
 Alright, thanks Erick. For the question about memory usage of merges, taken
 from Mike McCandless' blog:

 The big thing that stays in RAM is a logical int[] mapping old docIDs to
 new docIDs, but in more recent versions of Lucene (4.x) we use a much more
 efficient structure than a simple int[] ... see
 https://issues.apache.org/jira/browse/LUCENE-2357

 How much RAM is required is mostly a function of how many documents (lots
 of tiny docs use more RAM than fewer huge docs).


 A related clarification
 As my users are not aware of the fq possibility, I was wondering how I can
 make the best of this filter cache. Would it be efficient to implicitly
 transform their queries into filter queries on fields that are boolean
 searches (date range etc. that do not affect the score of a document)? Is
 this a good practice? Is there any plugin for a query parser that does it?




 Inline

 On Thu, Jul 11, 2013 at 8:36 AM, Manuel Le Normand
 manuel.lenorm...@gmail.com wrote:
  Hello,
  As a result of frequent java OOM exceptions, I try to investigate more
 into
  the solr jvm memory heap usage.
  Please correct me if I am mistaking, this is my understanding of usages
 for
  the heap (per replica on a solr instance):
  1. Buffers for indexing - bounded by ramBufferSize
  2. Solr caches
  3. Segment merge
  4. Miscellaneous- buffers for Tlogs, servlet overhead etc.
 
  Particularly I'm concerned by Solr caches and segment merges.
  1. How much memory consuming (bytes per doc) are FilterCaches
 (bitDocSet)
  and queryResultCaches (DocList)? I understand it is related to the skip
  spaces between doc id's that match (so it's not saved as a bitmap). But
  basically, is every id saved as a java int?

 Different beasts. filterCache consumes, essentially, maxDoc/8 bytes (you
 can get the maxDoc number from your Solr admin page). Plus some overhead
 for storing the fq text, but that's usually not much. This is for each
 entry up to Size.




 queryResultCache is usually trivial unless you've configured it
 extravagantly.
 It's the query string length + queryResultWindowSize integers per entry
 (queryResultWindowSize is from solrconfig.xml).

  2. QueryResultMaxDocsCached - (for example = 100) means that any query
  resulting in more than 100 docs will not be cached (at all) in the
  queryResultCache? Or does it have to do with the documentCache?
 It's just a limit on the queryResultCache entry size as far as I can
 tell. But again
 this cache is relatively small, I'd be surprised if it used
 significant resources.

  3. DocumentCache - written on the wiki it should be greater than
  max_results*concurrent_queries. Max result is just the num of rows
  displayed (rows-start) param, right? Not the queryResultWindow.

 Yes. This is a cache (I think) for the _contents_ of the documents you'll
 be returning to be manipulated by various components during the life
 of the query.

  4. LazyFieldLoading=true - when quering for id's only (fl=id) will this
  cache be used? (on the expense of eviction of docs that were already
 loaded
  with stored fields)

 Not sure, but I don't think this will contribute much to memory pressure.
 This
 is about how many fields are loaded to get a single value from a doc in
 the
 results list, and since one is usually working with 20 or so docs this
 is usually
 a small amount of memory.

  5. How large is the heap used by mergings? Assuming we have a merge of
 10
  segments of 500MB each (half inverted files - *.pos *.doc etc, half 

Re: SolrCloud group.query error shard X did not set sort field values or how i can set fillFields=true on IndexSearcher.search

2013-07-15 Thread Erick Erickson
I'm going to let someone who knows the splitting details
take over <G>...

Best
Erick

On Mon, Jul 15, 2013 at 5:19 AM, Evgeny Salnikov evg...@salnikoff.com wrote:
 Thank you!
 I really need to eventually increase the number of shards, so I cannot
 fix numShards=X up front; the only way out is SPLITSHARD, but then I
 encountered the problem described in steps 1-12 of my earlier message.

How to change extracted directory

2013-07-15 Thread wolbi
Hi,
I'm trying to change the default tempDir where the solr.war file is extracted to.
If I change the context or webapps XML it works, but I need to do it from the
command line and don't know how. I tried to run:

java -Djava.io.tmpdir=/path/to/my/dir -jar start.jar

or 

java -Djavax.servlet.context.tempdir=/path/to/my/dir -jar start.jar

..without success. I always get the default directory:


[main] WARN  org.eclipse.jetty.xml.XmlConfiguration  – Config error at <Set
name="tempDirectory"><Property name="jetty.home"
default="."/>/solr-webapp</Set>
...

Caused by: java.lang.IllegalArgumentException: Bad temp directory:
/opt/solr/app/solr-webapp
at
org.eclipse.jetty.webapp.WebAppContext.setTempDirectory(WebAppContext.java:1127)

Any suggestions?

regards







Re: Search with punctuations

2013-07-15 Thread Erick Erickson
1> You have to re-index after changing your schema, did you?
2> The admin/analysis page is your friend. It'll show you exactly
 what transformations are applied both at query and index time.
3> WhitespaceTokenizerFactory is only _part_ of the solution, it
 just breaks up the incoming text. WordDelimiterFilterFactory
 would then be applied to each token.
4> Yes, you must have the index and query time analysis chains
 be compatible. For the time being, identical is probably best,
 as I'm guessing you're not entirely familiar with the process.
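
A sketch of a field type along these lines (not from the original mail):
with catenateWords="1", INT'L also produces the catenated token INTL at
index time, so a query for INTL matches both forms:

<fieldType name="text_punct" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" catenateWords="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>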

Best
Erick

On Mon, Jul 15, 2013 at 2:40 AM, kobe.free.wo...@gmail.com
<kobe.free.wo...@gmail.com> wrote:
 Hi Erick,

 Thanks for your reply!

 I have tried both of the suggestions that you have mentioned i.e.,

 1. Using WhitespaceTokenizerFactory
 2. Using WordDelimiterFilterFactory with
 catenateWords=1

 But I still face the same issue. Must the tokenizers/filter factories used
 be the same for both the query and index analyzers?

 In my scenario, when I search for INTL, I want Solr to return both the
 records containing strings like INTL and those containing INT'L.

 Please suggest other alternatives to achieve this.

 Thanks!





Re: Nested query in SOLR filter query (fq)

2013-07-15 Thread Mikhail Khludnev
Hello,

it sounds like a FieldCollapsing or Join scenario, but given the only
information which you provided, it can be solved by indexing statuses as a
multivalued field:
 -ID-  -STATUS-
  id1    (1 2 3 4)
  id2    (1 2)
  id3    (1)

q=*:*&fq=STATUS:1&fq=NOT STATUS:3
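
(A SolrJ sketch of collapsing the rows into one document per id with a
multivalued STATUS field - field names as in the example above, other names
illustrative:)

import org.apache.solr.common.SolrInputDocument;

SolrInputDocument doc = new SolrInputDocument();
doc.addField("ID", "id1");
for (int s : new int[] {1, 2, 3, 4}) {
    doc.addField("STATUS", s);  // STATUS declared multiValued in the schema
}
server.add(doc);  // "server" is any SolrServer instance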




On Mon, Jul 15, 2013 at 3:19 PM, EquilibriumCST valeri_ho...@abv.bg wrote:

 Hi all,

 I have the following case.

 Solr documents has fields -- id and status. Id is not unique. Unique is
 the
 combination of these two elements.
 Documents with same id have different statuses.

 List of documents:

  -ID-  -STATUS-
   id1     1
   id1     2
   id1     3
   id1     4
   id2     1
   id2     2
   id3     1

 I need to make a query that takes all documents with a specific status and
 excludes documents that also have another specific status.
 As an example, I need to get all documents that have status 2 and don't have
 status 3.
 The expected result should be the document:
   id2     2

 Another example: all documents with status 1 that don't have status 3. Then
 the result should be:
   id2     1
   id3     1

 Here is my query that doesn't work:

 http://192.168.130.14:13080/solr/select/?q=status:1&version=2.2&start=0&rows=10&indent=on&fl=id,status&fq=-id:(*:*%20AND%20status:2)
 The problem is in the filter query (fq) part. The fq must contain the ids of
 the documents with status 2, and if the current document's id is in that list
 it must be excluded.
 I guess some subquery must be used in the fq part, or something else.
 Just for information, we are using Apache Solr 3.6 and the document count is
 around 100k.

 Thanks in advance!







-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

http://www.griddynamics.com
 mkhlud...@griddynamics.com


Running Solr in a cluster - high availability only

2013-07-15 Thread Mysurf Mail
Hi,
I would like to run two Solr instances on different computers as a cluster.
My main interest is High availability - meaning, in case one server crashes
or is down, there will always be another one.

(My performance on a single instance is great. I do not need to split the
data across two servers.)

Questions:
1. What is the best practice?
   Is it different than clustering for index splitting? Do I need Shards?
2. Do I need ZooKeeper?
3. Is it a container based configuration (different for jetty and tomcat)?
4. Do I need an external NLB for that?
5. When one computer comes up after crashing, how does it update its index?


Re: HTTP Status 503 - Server is shutting down

2013-07-15 Thread Sandeep Gupta
Hello,

I am able to configure solr 4.3.1 version with tomcat6.

I followed these steps:
1. Extract solr431 package. In my case I did in
E:\solr-4.3.1\example\solr
2. Now copied solr dir from extracted package (E:\solr-4.3.1\example\solr)
into TOMCAT_HOME dir.
In my case TOMCAT_HOME dir is pointed to E:\Apache\Tomcat 6.0.
3. I can now refer to SOLR_HOME as E:\Apache\Tomcat 6.0\solr (please
remember this)
4. Copy the solr.war file from the extracted package to the SOLR_HOME dir, i.e.
E:\Apache\Tomcat 6.0\solr. This is required to create the context, as I
do not want to pass this as JAVA_OPTS.
5. Create solr1.xml file into TOMCAT_HOME\conf\Catalina\localhost (I gave
file name as solr1.xml )

<?xml version="1.0" encoding="utf-8"?>
<Context docBase="E:\Apache\Tomcat 6.0\solr\solr.war" debug="0"
crossContext="true">
  <Environment name="solr/home" type="java.lang.String"
value="E:\Apache\Tomcat 6.0\solr" override="true"/>
</Context>

6. Also copy the solr.war file into TOMCAT_HOME\webapps for deployment purposes.
7. If you start tomcat you will get errors as mentioned by Shawn, so you
need to copy all 5 jar files from the solr extracted package
(E:\solr-4.3.1\example\lib\ext) to the TOMCAT_HOME\lib dir (jul-to-slf4j-1.6.6,
jcl-over-slf4j-1.6.6, slf4j-log4j12-1.6.6, slf4j-api-1.6.6, log4j-1.2.16).
8. Also copy the log4j.properties file from the
E:\solr-4.3.1\example\resources dir to the TOMCAT_HOME\lib dir.
9. Now if you start the tomcat you won't have any problem.

10. As on my side I am using an additional jar for the data import request
handler, please modify the solrconfig.xml file to point to the location
of the data import jar.
11. What I did:
In the solrconfig.xml file there is the section
<!-- <lib/> directives can be used to instruct Solr to load any Jars.
  <lib dir="./lib" />
-->

I added one line after this section (if I used the above line then I would
need to create a lib dir inside the Collection1 dir):
<lib dir="../lib" />

12. In SOLR_HOME (E:\Apache\Tomcat 6.0\solr) I created a lib folder, because
in my solrconfig.xml file I am referring to this lib dir,
and copied all the dataimport related jar
files (solr-dataimporthandler-4.3.1***) into it.
I did it this way because I do not want to use TOMCAT_HOME\lib.
13. Now restart the tomcat; I am sure there should not be any problem. If
there is some problem, refer to the solr.log file in the TOMCAT_HOME\logs dir.

As I said in point 12, I did not want to put jar files related to solr into
the TOMCAT_HOME\lib dir, but for the logging mechanism I had to. I tried
to put all 5 jars into this folder and removed them from the TOMCAT lib, but
then I got the error.

In an ideal scenario, we should not put all the jar files related to solr into
the TOMCAT lib dir.

Regards
Sandeep



On Mon, Jul 15, 2013 at 12:27 AM, PeterKerk vettepa...@hotmail.com wrote:

 Ok, still getting the same error HTTP Status 503 - Server is shutting
 down,
 so here's what I did now:

 - reinstalled tomcat
 - deployed solr-4.3.1.war in C:\Program Files\Apache Software
 Foundation\Tomcat 6.0\webapps
 - copied log4j-1.2.16.jar,slf4j-api-1.6.6.jar,slf4j-log4j12-1.6.6.jar to
 C:\Program Files\Apache Software Foundation\Tomcat
 6.0\webapps\solr-4.3.1\WEB-INF\lib
 - copied log4j.properties from
 C:\Dropbox\Databases\solr-4.3.1\example\resources to
 C:\Dropbox\Databases\solr-4.3.1\example\lib
 - restarted tomcat


 Now this shows in my Tomcat console:

 14-jul-2013 20:54:38 org.apache.catalina.core.AprLifecycleListener init
 INFO: The APR based Apache Tomcat Native library which allows optimal
 performanc
 e in production environments was not found on the java.library.path:
 C:\Program
 Files\Apache Software Foundation\Tomcat
 6.0\bin;C:\Windows\Sun\Java\bin;C:\Windo
 ws\system32;C:\Windows;C:\Program Files\Common Files\Microsoft
 Shared\Windows Li
 ve;C:\Program Files (x86)\Common Files\Microsoft Shared\Windows
 Live;C:\Windows\

 system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShe
 ll\v1.0\;C:\Program Files\TortoiseSVN\bin;c:\msxsl;C:\Program Files
 (x86)\Window
 s Live\Shared;C:\Program Files\Microsoft\Web Platform Installer\;C:\Program
 File
 s (x86)\Microsoft ASP.NET\ASP.NET Web Pages\v1.0\;C:\Program Files
 (x86)\Windows
  Kits\8.0\Windows Performance Toolkit\;C:\Program Files\Microsoft SQL
 Server\110
 \Tools\Binn\;C:\Program Files (x86)\Microsoft SQL
 Server\110\Tools\Binn\;C:\Prog
 ram Files\Microsoft SQL Server\110\DTS\Binn\;C:\Program Files
 (x86)\Microsoft SQ
 L Server\110\Tools\Binn\ManagementStudio\;C:\Program Files (x86)\Microsoft
 SQL S
 erver\110\DTS\Binn\;C:\Program Files (x86)\Java\jre6\bin;C:\Program
 Files\Java\j
 re631\bin;.
 14-jul-2013 20:54:39 org.apache.coyote.http11.Http11Protocol init
 INFO: Initializing Coyote HTTP/1.1 on http-8080
 14-jul-2013 20:54:39 org.apache.catalina.startup.Catalina load
 INFO: Initialization processed in 287 ms
 14-jul-2013 20:54:39 org.apache.catalina.core.StandardService start
 INFO: Starting service Catalina
 14-jul-2013 20:54:39 org.apache.catalina.core.StandardEngine start
 INFO: Starting 

Re: Running Solr in a cluster - high availability only

2013-07-15 Thread Jack Krupansky
* Go with SolrCloud - unless you think you're smarter than Yonik and Mark 
Miller.

* Replicas are used for both query capacity and resilience (HA).
* Shards are used for increased index capacity (number of documents) and 
to reduce query latency (parallel processing of portions of a query).
* You need at least three zookeepers for HA. They need to be external to the 
cluster in production.
* Load balancing - you need to do your own testing to confirm whether you 
need it. If so, that is outside of Solr.

* SolrCloud automatically recovers nodes when they come back up.
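
(An illustration, not from the original mail - with an external ZooKeeper
ensemble running, an HA-only layout for two nodes is one shard with two
replicas; the collection and host names are made up:)

http://node1:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=1&replicationFactor=2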

-- Jack Krupansky

-Original Message- 
From: Mysurf Mail

Sent: Monday, July 15, 2013 8:32 AM
To: solr-user@lucene.apache.org
Subject: Running Solr in a cluster - high availability only

Hi,
I would like to run two Solr instances on different computers as a cluster.
My main interest is High availability - meaning, in case one server crashes
or is down, there will always be another one.

(My performance on a single instance is great. I do not need to split the
data across two servers.)

Questions:
1. What is the best practice?
   Is it different than clustering for index splitting? Do I need Shards?
2. Do I need ZooKeeper?
3. Is it a container based configuration (different for jetty and tomcat)?
4. Do I need an external NLB for that?
5. When one computer comes up after crashing, how does it update its index?



How to Indicate Solr That: Both Ascified and Non-Ascii versions of tokens are same?

2013-07-15 Thread Furkan KAMACI
When I search for something which has non-ASCII characters at Google, it returns
results for both the original and ascified versions and *highlights both of them*.
For example, if I search *çiğli* at Google, the first result is:

*Çiğli* Belediyesi
www.*cigli*.bel.tr/

How can I do that with Solr? How can I indicate to Solr: *Both the ascified
and non-ASCII versions of tokens are the same?*


Re: How to Indicate Solr That: Both Ascified and Non-Ascii versions of tokens are same?

2013-07-15 Thread Jack Krupansky
Either do a custom highlighter or preprocess the query and generate an OR 
of the accented and unaccented terms. Solr has no magic feature to do both. 
Sure, you could do a token filter that duplicated each term and included 
both the accented and unaccented versions, but... it gets messy and is a 
pain with phrases.


It is worth a Jira though.

-- Jack Krupansky

-Original Message- 
From: Furkan KAMACI

Sent: Monday, July 15, 2013 9:06 AM
To: solr-user@lucene.apache.org
Subject: How to Indicate Solr That: Both Ascified and Non-Ascii versions of 
tokens are same?


When I search for something which has non-ASCII characters at Google, it returns
results for both the original and ascified versions and *highlights both of
them*.

For example, if I search *çiğli* at Google, the first result is:

*Çiğli* Belediyesi
www.*cigli*.bel.tr/

How can I do that with Solr? How can I indicate to Solr: *Both the ascified
and non-ASCII versions of tokens are the same?*



Re: How to Indicate Solr That: Both Ascified and Non-Ascii versions of tokens are same?

2013-07-15 Thread Ahmet Arslan
Hi Furkan,

Using MappingCharFilterFactory with mapping-FoldToASCII.txt or 
mapping-ISOLatin1Accent.txt

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.MappingCharFilterFactory
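
A field type sketch along those lines (not from the original mail; the
FoldToASCII mapping file ships with the Solr example config):

<fieldType name="text_folded" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mapping-FoldToASCII.txt"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Since a char filter runs before the tokenizer and keeps offset corrections,
highlighting should still wrap the original accented text.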






 From: Furkan KAMACI furkankam...@gmail.com
To: solr-user@lucene.apache.org 
Sent: Monday, July 15, 2013 4:06 PM
Subject: How to Indicate Solr That: Both Ascified and Non-Ascii versions of 
tokens are same?
 

When I search for something which has non-ASCII characters at Google, it returns
results for both the original and ascified versions and *highlights both of them*.
For example, if I search *çiğli* at Google, the first result is:

*Çiğli* Belediyesi
www.*cigli*.bel.tr/

How can I do that with Solr? How can I indicate to Solr: *Both the ascified
and non-ASCII versions of tokens are the same?*

Re: How to Indicate Solr That: Both Ascified and Non-Ascii versions of tokens are same?

2013-07-15 Thread Jack Krupansky
Actually, on second thought, I think you should be able to do this directly, 
but I don't have the highlighter magic at my fingertips. The field type 
analyzer simply needs to map the accented characters; the character 
positions of the accented and unaccented tokens should line up fine. Really, 
it is no different than highlighting tokens that have differences in upper 
and lower case.


-- Jack Krupansky

-Original Message- 
From: Jack Krupansky

Sent: Monday, July 15, 2013 9:13 AM
To: solr-user@lucene.apache.org
Subject: Re: How to Indicate Solr That: Both Ascified and Non-Ascii versions 
of tokens are same?


Either do a custom highlighter or preprocess the query and generate an OR
of the accented and unaccented terms. Solr has no magic feature to do both.
Sure, you could do a token filter that duplicated each term and included
both the accented and unaccented versions, but... it gets messy and is a
pain with phrases.

It is worth a Jira though.

-- Jack Krupansky

-Original Message- 
From: Furkan KAMACI

Sent: Monday, July 15, 2013 9:06 AM
To: solr-user@lucene.apache.org
Subject: How to Indicate Solr That: Both Ascified and Non-Ascii versions of
tokens are same?

When I search for something which has non-ASCII characters at Google, it
returns results for both the original and ascified versions and *highlights
both of them*.
For example, if I search *çiğli* at Google, the first result is:

*Çiğli* Belediyesi
www.*cigli*.bel.tr/

How can I do that with Solr? How can I indicate to Solr: *Both the ascified
and non-ASCII versions of tokens are the same?*



Re: Nested query in SOLR filter query (fq)

2013-07-15 Thread EquilibriumCST
Yes I know about that, but design schema cannot be changed. This is not my
decision :)





Facet sorting seems weird

2013-07-15 Thread Henrik Ossipoff Hansen
Hello, first time writing to the list. I am a developer for a company where we 
recently switched all of our search core from Sphinx to Solr with very great 
results. In general we've been very happy with the switch, and everything seems 
to work just as we want it to.

Today however we've run into a bit of an issue regarding faceted sort.

For example we have a field called brand in our core, defined as the text_en 
datatype from the example Solr core. This field is copied into facet_brand with 
the datatype string (since we don't really need to do much with it except show 
it for faceted navigation).

Now, given these two entries into the field on different documents, LEGO and 
bObles, and given facet.sort=index, it appears that LEGO is sorted as being 
before bObles. I assume this is because of casing differences.

My question then is, how do we define a decent datatype in our schema, where 
the casing is exact, but we are able to sort it without casing mattering?

Thank you :)

Best regards,
Henrik Ossipoff


Re: HTTP Status 503 - Server is shutting down

2013-07-15 Thread PeterKerk
Hi Sandeep,

Thank you for your extensive answer :)
Before I go through all your steps: I noticed you mentioned something
about a data import handler.

Now, what I will require after I've completed the basic setup of
Tomcat6 and Solr431: I want to migrate my Solr350 cores (now running on
Cygwin) to that environment.

C:\Dropbox\Databases\apache-solr-3.5.0\example\example-DIH\solr\tt
C:\Dropbox\Databases\apache-solr-3.5.0\example\example-DIH\solr\shop
C:\Dropbox\Databases\apache-solr-3.5.0\example\example-DIH\solr\homes

Will all your steps still apply with my above requirements or is a different
approach needed when migrating from the example-DIH with multiple cores?

Many thanks again! :)





RE: Facet sorting seems weird

2013-07-15 Thread David Quarterman
Hi Henrik,

Try setting up a copyfield in your schema and set the copied field to use 
something like 'text_ws' which implements LowerCaseFilterFactory. Then sort on 
the copyfield.
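
A sketch of such a sort type - the stock example schema ships something very
similar as "alphaOnlySort" (single token, lowercased, so sorting ignores case
while the displayed field keeps its exact casing):

<fieldType name="alphaOnlySort" class="solr.TextField" sortMissingLast="true" omitNorms="true">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
  </analyzer>
</fieldType>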

Regards,

DQ

-Original Message-
From: Henrik Ossipoff Hansen [mailto:h...@entertainment-trading.com] 
Sent: 15 July 2013 15:08
To: solr-user@lucene.apache.org
Subject: Facet sorting seems weird

Hello, first time writing to the list. I am a developer for a company where we 
recently switched all of our search core from Sphinx to Solr with very great 
results. In general we've been very happy with the switch, and everything seems 
to work just as we want it to.

Today however we've run into a bit of an issue regarding faceted sort.

For example we have a field called brand in our core, defined as the text_en 
datatype from the example Solr core. This field is copied into facet_brand with 
the datatype string (since we don't really need to do much with it except show 
it for faceted navigation).

Now, given these two entries into the field on different documents, LEGO and 
bObles, and given facet.sort=index, it appears that LEGO is sorted as being 
before bObles. I assume this is because of casing differences.

My question then is, how do we define a decent datatype in our schema, where 
the casing is exact, but we are able to sort it without casing mattering?

Thank you :)

Best regards,
Henrik Ossipoff


Aggregating data with Solr, getting group stats

2013-07-15 Thread Bojan Šmid
Hi,

  I see there are a few ways in Solr which can almost be used for my use
case, but all of them appear to fall short eventually.

  Here is what I am trying to do: consider the following document structure
(there are many more fields in play, but this is enough for an example):

Manufacturer
ProductType
Color
Size
Price
CountAvailableItems

  Based on user parameters (search string, some filters), I would fetch a
set of documents. What I need is to group resulting documents by different
attribute combinations (say Manufacturer + Color or ProductType + Color
+ Size or ...) and get stats (Max Price, Avg Price, Num of available
items) for those groups.

  Possible solutions in Solr:

1) StatsComponent - provides all stats I would need, but its grouping
functionality is basic - it can group on a single field (stats.field +
stats.facet) while I need field combinations. There is an issue
https://issues.apache.org/jira/browse/SOLR-2472 which tried to deal with
that, but it looks like it got stuck in the past.

2) Pivot Faceting - seems like it would provide all the grouping logic I
need, and in combination with https://issues.apache.org/jira/browse/SOLR-3583
(Percentiles for facets, pivot facets, and distributed pivot facets) it would
bring percentiles and averages. However, I would still miss things like
Max/Min/Sum, and the issue is not committed yet anyway. I would also depend
on another yet-to-be-committed issue,
https://issues.apache.org/jira/browse/SOLR-2894, for distributed support.

3) Configurable Collectors -
https://issues.apache.org/jira/browse/SOLR-4465 - seems promising, but
it allows grouping by just one field and, probably
a bigger problem, it seems it was just a POC and will need overhauling before
it is anywhere near being ready for commit.


  Are there any other options I missed?

  Thanks,

  Bojan


Re: Getting numDocs and pendingDocs in Solr4.3

2013-07-15 Thread Shawn Heisey
On 7/15/2013 3:08 AM, Federico Ragona wrote:
 Hi,
 
 I'm trying to write a validation test that reads some statistics by querying
 Solr 4.3 via HTTP, namely the number of indexed documents (`numDocs`) and the
 number of pending documents (`pendingDocs`) from the Solr4 cluster. I believe
 that in Solr3 there was a `stats.jsp` page that offered both numbers.
 
 Is there a way to get both fields in Solr4?

Solr4 should have all the stats that Solr3 has and then some.

If you select your core from the core selector, then click on Plugins /
Stats, click on UPDATEHANDLER, then open updateHandler on the right; I
think you'll find at least some of what you were looking for.  Other
parts of what you were looking for might be found on the Overview for
the core.

If you have the default core named "collection1" then a URL like this
one will get you there.  You can replace "collection1" with the name of
your core.  The /#/ in this URL indicates that it is part of the admin
UI, not something you'd want to query in a program:

http://server:port/solr/#/collection1/plugins/updatehandler?entry=updateHandler

The admin UI gathers most of its core-level information from the mbeans
handler found in the core itself.  The following URL is suitable for
querying in a program.  Note the collection1 in this URL as well:

http://server:port/solr/collection1/admin/mbeans?stats=true

This will default to XML output.  Like most things in Solr, if you add
wt=json to the URL, you'll get JSON format.  You can also add
indent=true for human readability.
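
(A sketch of reading both numbers in a test, not from the original mail -
SolrJ 4.x; the host, core name, and the "numDocs"/"docsPending" stat keys
are, as far as I know, what a stock 4.3 core reports:)

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.QueryRequest;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.params.ModifiableSolrParams;
import org.apache.solr.common.util.NamedList;

public class SolrStatsCheck {
    public static void main(String[] args) throws Exception {
        HttpSolrServer solr =
            new HttpSolrServer("http://localhost:8983/solr/collection1");

        ModifiableSolrParams params = new ModifiableSolrParams();
        params.set("stats", "true");
        QueryRequest req = new QueryRequest(params);
        req.setPath("/admin/mbeans");  // query the mbeans handler, not /select

        QueryResponse rsp = req.process(solr);
        NamedList<?> mbeans = (NamedList<?>) rsp.getResponse().get("solr-mbeans");

        // numDocs lives under CORE > searcher > stats
        NamedList<?> searcher =
            (NamedList<?>) ((NamedList<?>) mbeans.get("CORE")).get("searcher");
        System.out.println("numDocs = "
            + ((NamedList<?>) searcher.get("stats")).get("numDocs"));

        // pending docs live under UPDATEHANDLER > updateHandler > stats
        NamedList<?> uh = (NamedList<?>)
            ((NamedList<?>) mbeans.get("UPDATEHANDLER")).get("updateHandler");
        System.out.println("docsPending = "
            + ((NamedList<?>) uh.get("stats")).get("docsPending"));
    }
}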

Thanks,
Shawn



RE: Facet sorting seems weird

2013-07-15 Thread Henrik Ossipoff Hansen
Hello, thank you for the quick reply!

But given that facet.sort=index just sorts by the faceted index (and I don't 
want the facet itself to be in lower-case), would that really work?

Regards,
Henrik Ossipoff


-Original Message-
From: David Quarterman [mailto:da...@corexe.com] 
Sent: 15. juli 2013 16:46
To: solr-user@lucene.apache.org
Subject: RE: Facet sorting seems weird

Hi Henrik,

Try setting up a copyfield in your schema and set the copied field to use 
something like 'text_ws' which implements LowerCaseFilterFactory. Then sort on 
the copyfield.

Regards,

DQ

-Original Message-
From: Henrik Ossipoff Hansen [mailto:h...@entertainment-trading.com] 
Sent: 15 July 2013 15:08
To: solr-user@lucene.apache.org
Subject: Facet sorting seems weird

Hello, first time writing to the list. I am a developer for a company where we 
recently switched all of our search core from Sphinx to Solr with very great 
results. In general we've been very happy with the switch, and everything seems 
to work just as we want it to.

Today however we've run into a bit of an issue regarding faceted sort.

For example we have a field called brand in our core, defined as the text_en 
datatype from the example Solr core. This field is copied into facet_brand with 
the datatype string (since we don't really need to do much with it except show 
it for faceted navigation).

Now, given these two entries into the field on different documents, LEGO and 
bObles, and given facet.sort=index, it appears that LEGO is sorted as being 
before bObles. I assume this is because of casing differences.

My question then is, how do we define a decent datatype in our schema, where 
the casing is exact, but we are able to sort it without casing mattering?

Thank you :)

Best regards,
Henrik Ossipoff


Re: How to change extracted directory

2013-07-15 Thread Shawn Heisey
On 7/15/2013 5:45 AM, wolbi wrote:
 I'm trying to change the default tempDir where the solr.war file is extracted to.
 If I change the context or webapps XML it works, but I need to do it from the
 command line and don't know how. I tried to run:
 
 java -Djava.io.tmpdir=/path/to/my/dir -jar start.jar
 
 or 
 
 java -Djavax.servlet.context.tempdir=/path/to/my/dir -jar start.jar
 
 ..without success. I always get the default directory:
 
 
 [main] WARN  org.eclipse.jetty.xml.XmlConfiguration  – Config error at <Set
 name="tempDirectory"><Property name="jetty.home"
 default="."/>/solr-webapp</Set>

The temp directory location is specified by the Solr context fragment
for the example Jetty, which you can find in
example/contexts/solr-jetty-context.xml.  If you have 4.1 or 4.0, that
will be named example/contexts/solr.xml instead.  The easiest thing to
do is to edit that file.  This overrides what you specify on the
commandline.
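
For example, the stock fragment computes the temp path from jetty.home; a
sketch of hardcoding it instead (inside the <Configure> element of that file):

<Set name="tempDirectory">/path/to/my/dir</Set>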

I saw that you asked this same question in #solr on IRC.  You have to be
patient in an IRC tech channel.  I didn't even see your question until
long after you had disconnected.  It can literally take hours before
anyone is at their keyboard.  According to the time on your email, it's
taken me a few hours for this, too.

Thanks,
Shawn



RE: Facet sorting seems weird

2013-07-15 Thread James Thomas
Hi Henrik,

We did something related to this that I'll share.  I'm rather new to Solr so 
take this idea cautiously :-)
Our requirement was to show exact values but have case-insensitive sorting and 
facet filtering (prefix filtering).

We created an index field (type="string") for creating facets so that the
values are indexed as-is.
The values we indexed were given the format "lowercase value|exact value".
So for example, given the value "bObles", we would index the string
"bobles|bObles".
When displaying the facet we split the facet value from Solr in half and
display the second half to the user.
Of course the caveat is that you could have 2 facets that differ only in case, 
but to me that's a data cleansing issue.
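
A sketch of the scheme in code (not from the original mail; field and
variable names are illustrative):

// index time: prepend the lowercased form, keep the exact form
String exact = "bObles";
String indexed = exact.toLowerCase(java.util.Locale.ROOT) + "|" + exact; // "bobles|bObles"
doc.addField("facet_brand", indexed);

// display time: show only the part after the separator
String facetValue = "bobles|bObles";  // value returned by Solr
String display = facetValue.substring(facetValue.indexOf('|') + 1);  // "bObles"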

James

-Original Message-
From: Henrik Ossipoff Hansen [mailto:h...@entertainment-trading.com] 
Sent: Monday, July 15, 2013 10:57 AM
To: solr-user@lucene.apache.org
Subject: RE: Facet sorting seems weird

Hello, thank you for the quick reply!

But given that facet.sort=index just sorts by the faceted index (and I don't 
want the facet itself to be in lower-case), would that really work?

Regards,
Henrik Ossipoff


-Original Message-
From: David Quarterman [mailto:da...@corexe.com] 
Sent: 15. juli 2013 16:46
To: solr-user@lucene.apache.org
Subject: RE: Facet sorting seems weird

Hi Henrik,

Try setting up a copyfield in your schema and set the copied field to use 
something like 'text_ws' which implements LowerCaseFilterFactory. Then sort on 
the copyfield.

Regards,

DQ

-Original Message-
From: Henrik Ossipoff Hansen [mailto:h...@entertainment-trading.com] 
Sent: 15 July 2013 15:08
To: solr-user@lucene.apache.org
Subject: Facet sorting seems weird

Hello, first time writing to the list. I am a developer for a company where we 
recently switched all of our search core from Sphinx to Solr with very great 
results. In general we've been very happy with the switch, and everything seems 
to work just as we want it to.

Today however we've run into a bit of an issue regarding faceted sort.

For example we have a field called brand in our core, defined as the text_en 
datatype from the example Solr core. This field is copied into facet_brand with 
the datatype string (since we don't really need to do much with it except show 
it for faceted navigation).

Now, given these two entries into the field on different documents, LEGO and 
bObles, and given facet.sort=index, it appears that LEGO is sorted as being 
before bObles. I assume this is because of casing differences.

My question then is, how do we define a decent datatype in our schema, where 
the casing is exact, but we are able to sort it without casing mattering?

Thank you :)

Best regards,
Henrik Ossipoff


Re: Solr is not responding on deployment in tomcat

2013-07-15 Thread Erick Erickson
Sounds like Wicket and Solr are using the same port(s)...

If you start Wicket first then look at the Solr logs, you might
see some message about port already in use or some such.

If this is SolrCloud, there are also the ZooKeeper ports to
wonder about.

Best
Erick

On Mon, Jul 15, 2013 at 6:49 AM, Per Newgro per.new...@gmx.ch wrote:
 Hi,

 maybe someone here can help me with my solr-4.3.1 issue.

 I've successful deployed the solr.war on a tomcat7 instance.
 Starting the tomcat with only the solr.war deployed - works nicely.
 I can see the admin interface and logs are clean.

 If i
 deploy my wicket-spring-data-solr based app (using the HttpSolrServer)
 after the solr app
 without restarting the tomcat
 = all is fine to.

 I've implemented a ping to see if server is up.

 <code>
 private void waitUntilSolrIsAvailable(int i) {
     if (i == 0) {
         logger.info("Check solr state...");
     }
     if (i > 5) {
         throw new RuntimeException("Solr is not available "
                 + "after more than 25 secs. Going down now.");
     }
     if (i > 0) {
         try {
             logger.info("Wait for solr to get alive.");
             Thread.sleep(5000);
         } catch (InterruptedException e) {
             throw new RuntimeException(e);
         }
     }
     try {
         i++;
         SolrPingResponse r = solrServer.ping();
         if (r.getStatus() > 0) {
             waitUntilSolrIsAvailable(i);
         }
         logger.info("Solr is alive.");
     } catch (SolrServerException | IOException e) {
         throw new RuntimeException(e);
     }
 }
 </code>

 Here i can see log
 log
 54295 [localhost-startStop-2] INFO  org.apache.wicket.Application  – 
 [wicket.project] init: Wicket extensions initializer
 INFO  - 2013-07-15 12:07:45.261; 
 de.company.service.SolrServerInitializationService; Check solr state...
 54505 [localhost-startStop-2] INFO  
 de.company.service.SolrServerInitializationService  – Check solr state...
 INFO  - 2013-07-15 12:07:45.768; org.apache.solr.core.SolrCore; [collection1] 
 webapp=/solr path=/admin/ping params={wt=javabin&version=2} hits=0 status=0 
 QTime=20
 55012 [http-bio-8080-exec-1] INFO  org.apache.solr.core.SolrCore  – 
 [collection1] webapp=/solr path=/admin/ping params={wt=javabin&version=2} 
 hits=0 status=0 QTime=20
 INFO  - 2013-07-15 12:07:45.770; org.apache.solr.core.SolrCore; [collection1] 
 webapp=/solr path=/admin/ping params={wt=javabin&version=2} status=0 QTime=22
 55014 [http-bio-8080-exec-1] INFO  org.apache.solr.core.SolrCore  – 
 [collection1] webapp=/solr path=/admin/ping params={wt=javabin&version=2} 
 status=0 QTime=22
 INFO  - 2013-07-15 12:07:45.854; 
 de.company.service.SolrServerInitializationService; Solr is alive.
 55098 [localhost-startStop-2] INFO  
 de.company.service.SolrServerInitializationService  – Solr is alive.
 /log

 But if i
 restart the tomcat
 with both webapps (solr and wicket)
 the solr is not responding on the ping request.

 log
 INFO  - 2013-07-15 12:02:27.634; org.apache.wicket.Application; 
 [wicket.project] init: Wicket extensions initializer
 11932 [localhost-startStop-1] INFO  org.apache.wicket.Application  – 
 [wicket.project] init: Wicket extensions initializer
 INFO  - 2013-07-15 12:02:27.787; 
 de.company.service.SolrServerInitializationService; Check solr state...
 12085 [localhost-startStop-1] INFO  
 de.company.service.SolrServerInitializationService  – Check solr state...
 /log

 What could that be or how can i get infos where this is stopping?

 Thanks for your support
 Per


Re: Running Solr in a cluster - high availability only

2013-07-15 Thread Walter Underwood
With only two instances, replication may be the way to go. Or send updates to 
both.

Solr Cloud is much more tightly coupled, requires Zookeeper, etc. There are 
more ways for two Solr Cloud nodes to fail, compared with two Solr nodes using 
old-style replication. In general, a loosely-coupled system will be more 
robust. 

You should look at Solr Cloud if you need sharding or near real time.

wunder

On Jul 15, 2013, at 5:54 AM, Jack Krupansky wrote:

 * Go with SolrCloud - unless you think you're smarter than Yonik and Mark 
 Miller.
 * Replicas are used for both query capacity and resilience (HA).
 * Shards are used for increased index capacity (number of documents) and to 
 reduce query latency (parallel processing of portions of a query.)
 * You need at least three zookeepers for HA. They need to be external to the 
 cluster in production.
 * Load balancing - you need to do your own testing to confirm whether you 
 need it. If so, that is outside of Solr.
 * SolrCloud automatically recovers nodes when they come back up.
 
 -- Jack Krupansky
 
 -Original Message- From: Mysurf Mail
 Sent: Monday, July 15, 2013 8:32 AM
 To: solr-user@lucene.apache.org
 Subject: Running Solr in a cluster - high availability only
 
 Hi,
 I would like to run two Solr instances on different computers as a cluster.
 My main interest is High availability - meaning, in case one server crashes
 or is down there will be always another one.
 
  (my performance on a single instance is great. I do not need to split the
 data to two servers.)
 
 Questions:
 1. What is the best practice?
   Is it different than clustering for index splitting? Do I need Shards?
 2. Do I need zoo keeper?
 3. Is it a container based configuration (different for jetty and tomcat)
 4, Do I need an external NLB for that ?
  5. When one computer comes back up after crashing, how does it update its index?

--
Walter Underwood
wun...@wunderwood.org





Re: Facet sorting seems weird

2013-07-15 Thread Alexandre Rafalovitch
Hi Henrik,

If I understand the question correctly (case-insensitive sorting of the
facet values), then this is a limitation of the current Facet component.

You can see the full implementation at:
https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/handler/component/FacetComponent.java#L818

If you are comfortable with Java code, the easiest thing might be to
copy/fix the component and use your own one for faceting. The components
are defined in solrconfig.xml and FacetComponent is in a default chain.
See:
https://github.com/apache/lucene-solr/blob/trunk/solr/example/solr/collection1/conf/solrconfig.xml#L1194

If you do manage to do this (I would recommend doing it as an extra
option), it would be nice to have it contributed back to Solr. I think you
are not the only one with this requirement.

Regards,
   Alex.

Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)


On Mon, Jul 15, 2013 at 10:08 AM, Henrik Ossipoff Hansen 
h...@entertainment-trading.com wrote:

 Hello, first time writing to the list. I am a developer for a company
 where we recently switched all of our search core from Sphinx to Solr with
 very great results. In general we've been very happy with the switch, and
 everything seems to work just as we want it to.

 Today however we've run into a bit of an issue regarding faceted sort.

 For example we have a field called brand in our core, defined as the
 text_en datatype from the example Solr core. This field is copied into
 facet_brand with the datatype string (since we don't really need to do much
 with it except show it for faceted navigation).

 Now, given these two entries into the field on different documents, LEGO
 and bObles, and given facet.sort=index, it appears that LEGO is sorted as
 being before bObles. I assume this is because of casing differences.

 My question then is, how do we define a decent datatype in our schema,
 where the casing is exact, but we are able to sort it without casing
 mattering?

 Thank you :)

 Best regards,
 Henrik Ossipoff



Re: Apache Solr 4 - after 1st commit the index does not grow

2013-07-15 Thread glumet
Ok, I have removed the problem with OutOfMemory by increasing jvm
parameters... and now I have another problem. My index worked since
yesterday evening... the number of documents increased (I run bin/crawl
script every 3 hours and I have 27040 documents now).. but the last increase
was 6 hours ago... why did it stop growing again?

You can look at my solr here:
http://ir-dev.lmcloud.vse.cz:8082/solr/#/~logging

The log says:

java.lang.RuntimeException: [was class java.io.CharConversionException]
Invalid UTF-8 character 0x at char #2800441, byte #3096524)

What is it? how can I solve it? Does anyone have any idea?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Apache-Solr-4-after-1st-commit-the-index-does-not-grow-tp4077913p4078077.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Apache Solr 4 - after 1st commit the index does not grow

2013-07-15 Thread glumet
As I can see, this is the same problem like one from older posts -
http://lucene.472066.n3.nabble.com/strange-utf-8-problem-td3094473.html
...but it was without any response.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Apache-Solr-4-after-1st-commit-the-index-does-not-grow-tp4077913p4078079.html
Sent from the Solr - User mailing list archive at Nabble.com.


How to pass null OR empty values to fq?

2013-07-15 Thread SolrLover
Hi,

I am trying to pass empty values to fq parameter but passing null (or empty)
doesn't seem to work for fq.

Something like...

q=*:*&fq=(field1:test OR null)

We are trying to make fq more tolerant by making it not fail whenever a
particular variable's value is not passed.

Ex:

/select?q=*:*&fq=lname:$lname -- lname is empty here, and I don't want the
query to fail; rather it should just pass through and return everything
(returned by q). I can't really use the switch plugin directly as I have
more cases to handle, hence I am trying to handle it by creating a custom
component extending QParserPlugin.
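
(For reference, the single-parameter switch pattern I'm ruling out looks
roughly like this -- a sketch, assuming a simple field query on lname:

/select?q=*:*&fq={!switch case='*:*' default=$lname_fq v=$lname}&lname_fq={!field f=lname v=$lname}

An empty or missing lname falls into case='*:*', which matches everything.)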



--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-pass-null-OR-empty-values-to-fq-tp4078081.html
Sent from the Solr - User mailing list archive at Nabble.com.


How to pass null or empty value to fq?

2013-07-15 Thread SolrLover
Hi, 

I am trying to pass empty values to fq parameter but passing null (or empty)
doesn't seem to work for fq. 

Something like... 

q=*:*&fq=(field1:test OR null) 

We are trying to make fq more tolerant. It shouldn't fail if a particular
variable's value is not passed.

Ex: 

/select?q=*:*&fq=lname:$lname -- lname is empty here, and I don't want the
query to fail; rather it should just pass through and return everything
(returned by q). I can't really use the switch plugin directly as I have
more cases to handle, hence I am trying to handle it by creating a custom
component extending QParserPlugin.
  




--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-pass-null-or-empty-value-to-fq-tp4078082.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: How to pass null OR empty values to fq?

2013-07-15 Thread Jack Krupansky
I'm more than a little skeptical about your intentions here... just clean up 
your code and pass clean parameters ONLY!!!


Why is that so difficult?

You should have an application layer between your application client and 
Solr anyway, so what's the difficulty? I mean, why are you trying so 
hard just to avoid a few conditional statements in your app layer??


-- Jack Krupansky

-Original Message- 
From: SolrLover

Sent: Monday, July 15, 2013 11:43 AM
To: solr-user@lucene.apache.org
Subject: How to pass null OR empty values to fq?

Hi,

I am trying to pass empty values to fq parameter but passing null (or empty)
doesn't seem to work for fq.

Something like...

q=*:*&fq=(field1:test OR null)

We are trying to make fq more tolerant by making it not fail whenever a
particular variable's value is not passed.

Ex:

/select?q=*:*&fq=lname:$lname -- lname is empty here, and I don't want the
query to fail; rather it should just pass through and return everything
(returned by q). I can't really use the switch plugin directly as I have
more cases to handle, hence I am trying to handle it by creating a custom
component extending QParserPlugin.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-pass-null-OR-empty-values-to-fq-tp4078081.html
Sent from the Solr - User mailing list archive at Nabble.com. 



Re: Replication process on Master/Slave slowing down slave read/search performance

2013-07-15 Thread adityab
Walter, 
Could you provide some more details about your staggered replication
approach?
We are currently running into similar issues and it looks like staggered
replication is a better approach to address the performance issues on
Slaves.

thanks
Aditya



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Replication-process-on-Master-Slave-slowing-down-slave-read-search-performance-tp707934p4078090.html
Sent from the Solr - User mailing list archive at Nabble.com.


Clearing old nodes from zookeper without restarting solrcloud cluster

2013-07-15 Thread Luis Carlos Guerrero Covo
Hi,

Is there an easy way to clear zookeeper of all offline solr nodes without
restarting the cluster? We are having some stability issues and we think it
may be due to the leader querying old offline nodes.

thank you,

Luis Guerrero


Re: How to pass null OR empty values to fq?

2013-07-15 Thread SolrLover
Jack,

First, thanks a lot for your response.

We hardcode certain queries directly in the search component as it's easy
for us to make changes to the query on the SOLR side compared to changing
them in applications (many applications - mobile, desktop etc. - use a
single SOLR instance). We don't want to change the code which forms the
query every time the query changes; rather, just changing the query in SOLR
should do the job. The search team controls the boost and other matching
criteria, so it can change the boosts without affecting the applications.
Now, whenever a particular value is not passed in the query, we are trying
to do a pass-through so that the entire query doesn't fail (we pass through
only when the custom plugin is used along with the query - for ex:
!optional is the custom plugin that shouldn't throw any error if a value
for any particular variable is not present).

<requestHandler name="find" class="solr.SearchHandler" default="true">
  <str name="q">
  (
    _query_:"{!dismax qf=lname_i v=$lname}"^8.3 OR
    _query_:"{!dismax qf=lname_phonetic v=$lname}"^8.6
  )
  (
    _query_:"{!optional df='addr' qs=1 v=$where}"^6.2 OR
    _query_:"{!optional df='addr_i' qs=1 v=$where}"^6.2
  )
  (
    _query_:"{!dismax qf=person_name v=$fname}"^3.9 OR
    _query_:"{!dismax qf=name_phonetic_i v=$fname}"^0.9
  )
  </str>
</requestHandler>



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Re-How-to-pass-null-OR-empty-values-to-fq-tp4078085p4078094.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Replication process on Master/Slave slowing down slave read/search performance

2013-07-15 Thread Walter Underwood
We ran replication at ten minute intervals. One master, five slaves, and 
replication on the hour on the first slave, ten minutes after the hour on the 
second, twenty minutes after on the third, and so on.

You could do this with a single crontab on the master. Send requests to each 
slave to replicate.

We had a small index (about 250K docs)  that was updated once per day. The 
replication ran every hour, just in case we had to make a mid-day change, which 
did happen sometimes.
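
The crontab itself can be as simple as hitting each slave's replication
handler in turn -- a sketch with made-up hostnames, assuming polling is
disabled on the slaves so they only replicate when told to:

0 * * * *  curl -s "http://slave1:8983/solr/core/replication?command=fetchindex"
10 * * * * curl -s "http://slave2:8983/solr/core/replication?command=fetchindex"
20 * * * * curl -s "http://slave3:8983/solr/core/replication?command=fetchindex"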

wunder

On Jul 15, 2013, at 9:20 AM, adityab wrote:

 Walter, 
 Could you provide some more details about your staggered replication
 approach?
 We are currently running into similar issues and looks like staggered
 replication is a better approach to address the performance issues on
 Slaves.
 
 thanks
 Aditya
 
 
 
 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Replication-process-on-Master-Slave-slowing-down-slave-read-search-performance-tp707934p4078090.html
 Sent from the Solr - User mailing list archive at Nabble.com.






Re: Solr caching clarifications

2013-07-15 Thread Manuel Le Normand
Great explanation and article.

Yes, this buffer for merges seems very small, and still optimized. That's
impressive.


Velocity Example: Where is #url_for_home defined?

2013-07-15 Thread O. Olson
I am new to using Velocity, esp. with Solr. In the Velocity example
provided, I am curious where #url_for_home is set, i.e., where its value is
assigned. (It is used a lot in the macros defined in VM_global_library.vm.)

Thank you in advance,
O. O.




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Velocity-Example-Where-is-url-for-home-defined-tp4078104.html
Sent from the Solr - User mailing list archive at Nabble.com.


solr 4.3, autocommit, maxdocs

2013-07-15 Thread Jonathan Rochkind
I have a solr 4.3 instance I am in the process of standing up. It 
started out with an empty index.


I have in its solrconfig.xml,

  <updateHandler class="solr.DirectUpdateHandler2">
    <autoCommit>
      <maxDocs>100000</maxDocs>
      <openSearcher>false</openSearcher>
    </autoCommit>
  </updateHandler>

I have an index process running, that has currently added around 400k 
documents to Solr.


I had expected that a 'commit' would be run every 100k documents, from 
the above configuration, so 4 commits would have been run by now, and 
I'd see documents in the index.


However, when I look in the Solr admin interface, at my core's 
'overview' page, it still says num docs 0, segment count 0.  When I 
expected num docs 400k at this point.


Is there something I'm misunderstanding about the configuration or the 
admin interface? Or am I right in my expectations, but something else 
must be going wrong?


Thanks for any advice,

Jonathan


Re: solr 4.3, autocommit, maxdocs

2013-07-15 Thread Jason Hellman
Jonathan,

Please note the openSearcher=false part of your configuration.  This is why you 
don't see documents.  The commits are occurring, and being written to segments 
on disk, but they are not visible to the search engine because a Solr searcher 
class has not opened them for visibility.

You can either change the value to true, or alternatively issue an explicit 
commit at the end of your load (a solr/update?commit=true will default to 
openSearcher=true).

Hope that's of use!

Jason


On Jul 15, 2013, at 9:52 AM, Jonathan Rochkind rochk...@jhu.edu wrote:

 I have a solr 4.3 instance I am in the process of standing up. It started out 
 with an empty index.
 
 I have in its solrconfig.xml,
 
  <updateHandler class="solr.DirectUpdateHandler2">
    <autoCommit>
      <maxDocs>100000</maxDocs>
      <openSearcher>false</openSearcher>
    </autoCommit>
  </updateHandler>
 
 I have an index process running, that has currently added around 400k 
 documents to Solr.
 
 I had expected that a 'commit' would be run every 100k documents, from the 
 above configuration, so 4 commits would have been run by now, and I'd see 
 documents in the index.
 
 However, when I look in the Solr admin interface, at my core's 'overview' 
 page, it still says num docs 0, segment count 0.  When I expected num docs 
 400k at this point.
 
 Is there something I'm misunderstanding about the configuration or the admin 
 interface? Or am I right in my expectations, but something else must be going 
 wrong?
 
 Thanks for any advice,
 
 Jonathan



Change Velocity Template Directory

2013-07-15 Thread O. Olson
Is there any way to change the default Velocity directory where the Velocity
templates are stored? In the example download, I modified the solrconfig.xml
under the Solr Request Handler to add: 

<str name="v.base_dir">conf/mycustom/</str>

I have a mycustom directory under the conf directory for the example core,
but I still get the “Unable to find resource 'browse.vm'” exception/error. 

I actually renamed the velocity directory to mycustom. So it has all the
template files that Velocity needs - at least that’s what I figured.

Thank you in advance for any help,
O. O.




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Change-Velocity-Template-Directory-tp4078120.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Change Velocity Template Directory

2013-07-15 Thread Erik Hatcher
Try supplying an absolute path.  I'm away from my computer so can't check just 
yet, but it is probably coded to consider that value absolute since moving it 
generally means you want templates outside of your Solr conf/. 

   Erik

On Jul 15, 2013, at 13:25, O. Olson olson_...@yahoo.it wrote:

 Is there any way to change the default Velocity directory where the Velocity
 templates are stored? In the example download, I modified the solrconfig.xml
 under the Solr Request Handler to add: 
 
  <str name="v.base_dir">conf/mycustom/</str>
 
 I have a mycustom directory under the conf directory for the example core,
 but I still get the “Unable to find resource 'browse.vm'” exception/error. 
 
 I actually renamed the velocity directory to mycustom. So it has all the
 template files that Velocity needs - at least that’s what I figured.
 
 Thank you in advance for any help,
 O. O.
 
 
 
 
 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Change-Velocity-Template-Directory-tp4078120.html
 Sent from the Solr - User mailing list archive at Nabble.com.


Re: solr 4.3, autocommit, maxdocs

2013-07-15 Thread Jonathan Rochkind
Ah, thanks for this explanation. Although I don't entirely understand 
it, I am glad there is an expected explanation!


This Solr instance is actually set up to be a replication master. It 
never gets searched itself, it just replicates to slaves that get searched.


Perhaps some time in the past (I am migrating from an already set up 
Solr 1.4 instance), I set this value to false, figuring it was not 
necessary to actually open a searcher, since the master does not get 
searched itself ordinarily.


Despite the opensearcher=false... once committed, are the committed docs 
still going to be sent via replication to a slave, is the index used for 
replication actually changed, even though a searcher hasn't been opened 
to take account of it?  Or will the opensearcher=false keep the commits 
from being seen by replication slaves too?


Thanks for any tips,

Jonathan

On 7/15/13 12:57 PM, Jason Hellman wrote:

Jonathan,

Please note the openSearcher=false part of your configuration.  This is why you 
don't see documents.  The commits are occurring, and being written to segments 
on disk, but they are not visible to the search engine because a Solr searcher 
class has not opened them for visibility.

You can either change the value to true, or alternatively issue an explicit 
commit at the end of your load (a solr/update?commit=true will default to 
openSearcher=true).

Hope that's of use!

Jason


On Jul 15, 2013, at 9:52 AM, Jonathan Rochkind rochk...@jhu.edu wrote:


I have a solr 4.3 instance I am in the process of standing up. It started out 
with an empty index.

I have in its solrconfig.xml,

  <updateHandler class="solr.DirectUpdateHandler2">
    <autoCommit>
      <maxDocs>100000</maxDocs>
      <openSearcher>false</openSearcher>
    </autoCommit>
  </updateHandler>

I have an index process running, that has currently added around 400k documents 
to Solr.

I had expected that a 'commit' would be run every 100k documents, from the 
above configuration, so 4 commits would have been run by now, and I'd see 
documents in the index.

However, when I look in the Solr admin interface, at my core's 'overview' page, 
it still says num docs 0, segment count 0.  When I expected num docs 400k at 
this point.

Is there something I'm misunderstanding about the configuration or the admin 
interface? Or am I right in my expectations, but something else must be going 
wrong?

Thanks for any advice,

Jonathan




Example for DIH data source through query string

2013-07-15 Thread Kiran J
Hi,

I want to dynamically specify the data source in the URL when invoking data
import handler. I'm looking at this :

http://wiki.apache.org/solr/DataImportHandler#solrconfigdatasource

  <requestHandler name="/dataimport"
    class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
      <str name="config">/home/username/data-config.xml</str>
      <lst name="datasource">
        <str name="driver">com.mysql.jdbc.Driver</str>
        <str name="url">jdbc:mysql://localhost/dbname</str>
        <str name="user">db_username</str>
        <str name="password">db_password</str>
      </lst>
    </lst>
  </requestHandler>


Can anyone give me a good example ?

ie http://localhost:8983/solr/dataimport?datasource=what goes here ?

Your help is much appreciated.

Thanks


Re: Doc's FunctionQuery result field in my custom SearchComponent class ?

2013-07-15 Thread Tony Mullins
any help plz !!!


On Mon, Jul 15, 2013 at 4:13 PM, Tony Mullins tonymullins...@gmail.com wrote:

 Can anyone please help with how to get the value of the 'freq' field in my
 custom SearchComponent?


 http://localhost:8080/solr/collection2/demoendpoint?q=spider&wt=xml&indent=true&fl=*,freq:termfreq%28product,%27spider%27%29

 <doc><str name="id">11</str><str name="type">Video Games</str><str
 name="format">xbox 360</str><str name="product">The Amazing
 Spider-Man</str><int name="popularity">11</int><long
 name="_version_">1439994081345273856</long><int name="freq">1</int></doc>



 Here is my code

 DocList docs = rb.getResults().docList;
 DocIterator iterator = docs.iterator();
 int sumFreq = 0;
 String id = null;

 for (int i = 0; i < docs.size(); i++) {
     try {
         int docId = iterator.nextDoc();

         // Document doc = searcher.doc(docId, fieldSet);
         Document doc = searcher.doc(docId);

 In the doc object I can see the schema fields like 'id', 'type', 'format'
 etc., but I cannot find the field 'freq' which I need. Is there any way to
 get the FunctionQuery fields in the doc object?

 Thanks,
 Tony



 On Mon, Jul 15, 2013 at 1:16 PM, Tony Mullins tonymullins...@gmail.com wrote:

 Hi,

  I have extended Solr's SearchComponent class and I am iterating through
  all the docs in the ResponseBuilder in my overridden process() method.

  Here I want to get the value of the FunctionQuery result, but in the
  Document object I am only seeing the standard fields of the document, not
  the FunctionQuery result.

 This is my query


  http://localhost:8080/solr/collection2/demoendpoint?q=spider&wt=xml&indent=true&fl=*,freq:termfreq%28product,%27spider%27%29

  The result of the above query in the browser shows me that 'freq' is part
  of the doc, but it's not there in the Document object in my overridden
  process() method.

 How can I get the value of FunctionQuery result in my custom
 SearchComponent ?

 Thanks,
 Tony





Re: Example for DIH data source through query string

2013-07-15 Thread Alexandre Rafalovitch
I don't think you can get there from here.

But you can specify config file on a query line. If you only have a couple
of configurations, you could have them in different files and switch that
way.

Regards,
   Alex.

Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)


On Mon, Jul 15, 2013 at 2:56 PM, Kiran J kiranjuni...@gmail.com wrote:

 Hi,

 I want to dynamically specify the data source in the URL when invoking data
 import handler. I'm looking at this :

 http://wiki.apache.org/solr/DataImportHandler#solrconfigdatasource

   <requestHandler name="/dataimport"
     class="org.apache.solr.handler.dataimport.DataImportHandler">
     <lst name="defaults">
       <str name="config">/home/username/data-config.xml</str>
       <lst name="datasource">
         <str name="driver">com.mysql.jdbc.Driver</str>
         <str name="url">jdbc:mysql://localhost/dbname</str>
         <str name="user">db_username</str>
         <str name="password">db_password</str>
       </lst>
     </lst>
   </requestHandler>


 Can anyone give me a good example ?

 ie http://localhost:8983/solr/dataimport?datasource=what goes here ?

 Your help is much appreciated.

 Thanks



Re: Velocity Example: Where is #url_for_home defined?

2013-07-15 Thread Erik Hatcher
#url_for_home is defined in conf/velocity/VM_global_library.vm.  Note that
it builds upon #url_root defined just above it, so maybe that's what you
want to adjust if you need to tinker with it.

Erik

On Jul 15, 2013, at 12:49, O. Olson olson_...@yahoo.it wrote:

 I am new to using Velocity esp. with Solr. In the Velocity example provided,
 I am curious where #url_for_home is set i.e. its value assigned? (It is used
 a lot in the macros defined in VM_global_library.vm.)
 
 Thank you in advance,
 O. O.
 
 
 
 
 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Velocity-Example-Where-is-url-for-home-defined-tp4078104.html
 Sent from the Solr - User mailing list archive at Nabble.com.


Re: Doc's FunctionQuery result field in my custom SearchComponent class ?

2013-07-15 Thread Patanachai Tangchaisin

Hi,

I think the process of retrieving stored fields (through fl) happens
after the SearchComponents run.

One solution: if you wrap the q param in a function query, your score will
be the result of the function.
For example,

http://localhost:8080/solr/collection2/demoendpoint?q=termfreq%28product,%27spider%27%29&wt=xml&indent=true&fl=*,score


Now your score is going to be a result of termfreq(product,'spider')
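
In a custom SearchComponent you can then read that score straight off the
DocIterator -- roughly like this (a sketch; the DocList must have been
collected with scores for score() to return anything meaningful):

DocIterator it = rb.getResults().docList.iterator();
while (it.hasNext()) {
    int docId = it.nextDoc();
    float freq = it.score(); // the termfreq value, via the wrapped query
}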


--
Patanachai Tangchaisin


On 07/15/2013 12:01 PM, Tony Mullins wrote:

any help plz !!!


On Mon, Jul 15, 2013 at 4:13 PM, Tony Mullins tonymullins...@gmail.com wrote:


Can anyone please help with how to get the value of the 'freq' field in my
custom SearchComponent?


http://localhost:8080/solr/collection2/demoendpoint?q=spider&wt=xml&indent=true&fl=*,freq:termfreq%28product,%27spider%27%29

<doc><str name="id">11</str><str name="type">Video Games</str><str
name="format">xbox 360</str><str name="product">The Amazing
Spider-Man</str><int name="popularity">11</int><long
name="_version_">1439994081345273856</long><int name="freq">1</int></doc>



Here is my code

DocList docs = rb.getResults().docList;
DocIterator iterator = docs.iterator();
int sumFreq = 0;
String id = null;

for (int i = 0; i < docs.size(); i++) {
    try {
        int docId = iterator.nextDoc();

        // Document doc = searcher.doc(docId, fieldSet);
        Document doc = searcher.doc(docId);

In the doc object I can see the schema fields like 'id', 'type', 'format'
etc., but I cannot find the field 'freq' which I need. Is there any way to
get the FunctionQuery fields in the doc object?

Thanks,
Tony



On Mon, Jul 15, 2013 at 1:16 PM, Tony Mullins tonymullins...@gmail.com wrote:


Hi,

I have extended Solr's SearchComponent class and I am iterating through
all the docs in the ResponseBuilder in my overridden process() method.

Here I want to get the value of the FunctionQuery result, but in the
Document object I am only seeing the standard fields of the document, not
the FunctionQuery result.

This is my query


http://localhost:8080/solr/collection2/demoendpoint?q=spiderwt=xmlindent=truefl=*,freq:termfreq%28product,%27spider%27%29

The result of the above query in the browser shows me that 'freq' is part
of the doc, but it's not there in the Document object in my overridden
process() method.

How can I get the value of FunctionQuery result in my custom
SearchComponent ?

Thanks,
Tony






CONFIDENTIALITY NOTICE
==
This email message and any attachments are for the exclusive use of the 
intended recipient(s) and may contain confidential and privileged information. 
Any unauthorized review, use, disclosure or distribution is prohibited. If you 
are not the intended recipient, please contact the sender by reply email and 
destroy all copies of the original message along with any attachments, from 
your computer system. If you are the intended recipient, please be advised that 
the content of this message is subject to access, review and disclosure by the 
sender's Email System Administrator.



MorphlineSolrSink

2013-07-15 Thread Rajesh Jain
Newbie question:

I have a Flume server, where I am writing to a sink which is a RollingFile
Sink.

I have to take these files from the sink and send them to Solr, which can
index them and provide search.

Do I need to configure MorphlineSolrSink?

What is the mechanism to do this, or to send this data over to Solr?

Thanks,
Rajesh


Different 'fl' for first X results

2013-07-15 Thread Weber
How to get a different field list in the first X results? For example, in the
first 5 results I want fields A, B, C, and on the next results I need only
fields A and B.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Different-fl-for-first-X-results-tp4078178.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Different 'fl' for first X results

2013-07-15 Thread Alexandre Rafalovitch
It is not really possible.  Why do you actually need it?

Regards,
   Alex.

Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)


On Mon, Jul 15, 2013 at 4:58 PM, Weber solrmaill...@fluidolabs.com wrote:

 How to get a different field list in the first X results? For example, in
 the
 first 5 results I want fields A, B, C, and on the next results I need only
 fields A, and B.



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Different-fl-for-first-X-results-tp4078178.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Solr 4.3.1: Errors When Attempting to Index LatLon Fields

2013-07-15 Thread Scott Vanderbilt
I'm trying to index documents containing geo-spatial coordinates using 
Solr 4.3.1 and am running into some difficulties. Whenever I attempt to 
index a particular document containing a geospatial coordinate pair 
(using post.jar), the operation fails as follows:


  SimplePostTool version 1.5
  Posting files to base url http://localhost:8080/solr/update using
  content-type application/xml..
  POSTing file rib1.xml
  SimplePostTool: WARNING: Solr returned an error #400 Bad Request
  SimplePostTool: WARNING: IOException while reading response:
 java.io.IOException: Server returned HTTP response code: 400 for
 URL: http://localhost:8080/solr/update
  1 files indexed.
  COMMITting Solr index changes to http://localhost:8080/solr/update..
  Time spent: 0:00:00.063

The solr log shows the following:

  08:30:39 ERROR SolrCore org.apache.solr.common.SolrException:
undefined field: geoFindspot_0_coordinate

The relevant parts of my schema.xml are:

  <field name="geoFindspot" type="location" indexed="true"
         stored="true" multiValued="true"/>
  ...
  <fieldType name="location" class="solr.LatLonType"
             subFieldSuffix="_coordinate"/>
  <dynamicField name="*_coordinate" type="tdouble" indexed="true"
                stored="false"/>

The document I am attempting to index has this field:

   <field name="geoFindspot">51.512332,-0.090588</field>

As far as I can tell, my configuration complies with the instructions on 
the relevant Wiki page (http://wiki.apache.org/solr/SpatialSearch) and I 
can see nothing amiss.


Any suggestions as to why this is failing would be greatly appreciated. 
Thank you!




Re: Velocity Example: Where is #url_for_home defined?

2013-07-15 Thread O. Olson
Thank you very much Erik. That's exactly what I was looking for. I could
swear I looked in VM_global_library.vm. I'm not sure how I missed it :-(
O. O.


Erik Hatcher-4 wrote
 #url_for_home is defined in conf/velocity/VM_global_library.vm.  Note that
 it builds upon #url_root defined just above it, so maybe that's what you
 want to adjust if you need to tinker with it.
 
 Erik





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Velocity-Example-Where-is-url-for-home-defined-tp4078104p4078186.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Different 'fl' for first X results

2013-07-15 Thread Jack Krupansky
1. Request all fields needed for all results and simply ignore the extra 
field(s) (which can be empty or missing and will automatically be ignored by 
Solr anyway).

2. Two separate query requests.
3. A custom search component.
4. Wait for the new scripted query request handler that gives you full 
control in a custom script.


-- Jack Krupansky

-Original Message- 
From: Weber

Sent: Monday, July 15, 2013 4:58 PM
To: solr-user@lucene.apache.org
Subject: Different 'fl' for first X results

How to get a different field list in the first X results? For example, in 
the

first 5 results I want fields A, B, C, and on the next results I need only
fields A, and B.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Different-fl-for-first-X-results-tp4078178.html
Sent from the Solr - User mailing list archive at Nabble.com. 



Re: Change Velocity Template Directory

2013-07-15 Thread O. Olson
Thank you Erik. I did not think the Windows file/directory path format would
work for Solr. For others the following worked for me:
<str name="v.base_dir">C:\Users\MyUsername\Solr\example\example-DIH\solr\db\conf\mycustom\</str>



Erik Hatcher-4 wrote
 Try supplying an absolute path.  I'm away from my computer so can't check
 just yet, but it is probably coded to consider that value absolute since
 moving it generally means you want templates outside of your Solr conf/. 
 
Erik





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Change-Velocity-Template-Directory-tp4078120p4078188.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr 4.3.1: Errors When Attempting to Index LatLon Fields

2013-07-15 Thread Jack Krupansky

Make sure that dynamicFields are within <fields> rather than <types>.

Solr tends to ignore misplaced configuration elements.
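
In other words, the skeleton should look like this (abbreviated, using the
field definitions from your message):

<schema name="..." version="...">
  <types>
    <fieldType name="location" class="solr.LatLonType"
               subFieldSuffix="_coordinate"/>
    ...
  </types>
  <fields>
    <field name="geoFindspot" type="location" indexed="true"
           stored="true" multiValued="true"/>
    <dynamicField name="*_coordinate" type="tdouble" indexed="true"
                  stored="false"/>
    ...
  </fields>
</schema>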

-- Jack Krupansky

-Original Message- 
From: Scott Vanderbilt 
Sent: Monday, July 15, 2013 5:10 PM 
To: solr-user@lucene.apache.org 
Subject: Solr 4.3.1: Errors When Attempting to Index LatLon Fields 

I'm trying to index documents containing geo-spatial coordinates using 
Solr 4.3.1 and am running into some difficulties. Whenever I attempt to 
index a particular document containing a geospatial coordinate pair 
(using post.jar), the operation fails as follows:


  SimplePostTool version 1.5
  Posting files to base url http://localhost:8080/solr/update using
  content-type application/xml..
  POSTing file rib1.xml
  SimplePostTool: WARNING: Solr returned an error #400 Bad Request
  SimplePostTool: WARNING: IOException while reading response:
 java.io.IOException: Server returned HTTP response code: 400 for
 URL: http://localhost:8080/solr/update
  1 files indexed.
  COMMITting Solr index changes to http://localhost:8080/solr/update..
  Time spent: 0:00:00.063

The solr log shows the following:

  08:30:39 ERROR SolrCore org.apache.solr.common.SolrException:
undefined field: geoFindspot_0_coordinate

The relevant parts of my schema.xml are:

  <field name="geoFindspot" type="location" indexed="true"
         stored="true" multiValued="true"/>
  ...
  <fieldType name="location" class="solr.LatLonType"
             subFieldSuffix="_coordinate"/>
  <dynamicField name="*_coordinate" type="tdouble" indexed="true"
                stored="false"/>

The document I am attempting to index has this field:

   <field name="geoFindspot">51.512332,-0.090588</field>

As far as I can tell, my configuration complies with the instructions on 
the relevant Wiki page (http://wiki.apache.org/solr/SpatialSearch) and I 
can see nothing amiss.


Any suggestions as to why this is failing would be greatly appreciated. 
Thank you!


Re: Solr 4.3.1: Errors When Attempting to Index LatLon Fields

2013-07-15 Thread Scott Vanderbilt

Brilliant. That's precisely what the issue was.

The Wiki didn't give a context for where the dynamicField element was 
supposed to go and I assumed (incorrectly) that it was in <types>. Of 
course, I should not have assumed that and should have verified it 
independently. Mea culpa.


Thanks for the gentle application of the clue stick. <g>



On 7/15/2013 2:25 PM, Jack Krupansky wrote:

Make sure that dynamicFields are within fields rather than types.

Solr tends to ignore misplaced configuration elements.

-- Jack Krupansky

-Original Message- From: Scott Vanderbilt Sent: Monday, July 15,
2013 5:10 PM To: solr-user@lucene.apache.org Subject: Solr 4.3.1: Errors
When Attempting to Index LatLon Fields
I'm trying to index documents containing geo-spatial coordinates using
Solr 4.3.1 and am running into some difficulties. Whenever I attempt to
index a particular document containing a geospatial coordinate pair
(using post.jar), the operation fails as follows:

   SimplePostTool version 1.5
   Posting files to base url http://localhost:8080/solr/update using
   content-type application/xml..
   POSTing file rib1.xml
   SimplePostTool: WARNING: Solr returned an error #400 Bad Request
   SimplePostTool: WARNING: IOException while reading response:
  java.io.IOException: Server returned HTTP response code: 400 for
  URL: http://localhost:8080/solr/update
   1 files indexed.
   COMMITting Solr index changes to http://localhost:8080/solr/update..
   Time spent: 0:00:00.063

The solr log shows the following:

   08:30:39 ERROR SolrCore org.apache.solr.common.SolrException:
 undefined field: geoFindspot_0_coordinate

The relevant parts of my schema.xml are:

   <field name="geoFindspot" type="location" indexed="true"
          stored="true" multiValued="true"/>
   ...
   <fieldType name="location" class="solr.LatLonType"
              subFieldSuffix="_coordinate"/>
   <dynamicField name="*_coordinate" type="tdouble" indexed="true"
                 stored="false"/>

The document I am attempting to index has this field:

<field name="geoFindspot">51.512332,-0.090588</field>

As far as I can tell, my configuration complies with the instructions on
the relevant Wiki page (http://wiki.apache.org/solr/SpatialSearch) and I
can see nothing amiss.

Any suggestions as to why this is failing would be greatly appreciated.
Thank you!





Re: ACL implementation: Pseudo-join performance Atomic Updates

2013-07-15 Thread Roman Chyla
On Sun, Jul 14, 2013 at 1:45 PM, Oleg Burlaca oburl...@gmail.com wrote:

 Hello Erick,

  Join performance is most sensitive to the number of values
  in the field being joined on. So if you have lots and lots of
  distinct values in the corpus, join performance will be affected.
 Yep, we have a list of unique Id's that we get by first searching for
 records
 where loggedInUser IS IN (userIDs)
 This corpus is stored in memory I suppose? (not a problem) and then the
 bottleneck is to match this huge set with the core where I'm searching?

 Somewhere in maillist archive people were talking about external list of
 Solr unique IDs
 but didn't find if there is a solution.
 Back in 2010 Yonik posted a comment:
 http://find.searchhub.org/document/363a4952446b3cd#363a4952446b3cd


sorry, I haven't read the previous thread in its entirety, but a few weeks
back Yonik's proposal got implemented, it seems ;)

http://search-lucene.com/m/Fa3Dg14mqoj/bitsetsubj=Re+Solr+large+boolean+filter

You could use this to send very large bitset filter (which can be
translated into any integers, if you can come up with a mapping function).

roman



  bq: I suppose the delete/reindex approach will not change soon
  There is ongoing work (search the JIRA for Stacked Segments)
 Ah, ok, I had a feeling it affects the architecture. OK, now the only hope
 is Pseudo-Joins ))

  One way to deal with this is to implement a post filter, sometimes
 called
  a no cache filter.
 thanks, will have a look, but as you describe it, it's not the best option.

 The approach of "too many documents, man. Please refine your query.
 Partial results below" means faceting will not work correctly?

 ... I have in mind a hybrid approach, comments welcome:
 Most of the time users are not searching, but browsing content, so our
 virtual filesystem stored in SOLR will use only the index with the Id of
 the file and the list of users that have access to it. i.e. not touching
 the fulltext index at all.

 Files may have metadata (EXIF info for images for ex) that we'd like to
 filter by, calculate facets.
 Meta will be stored in both indexes.

 In case of a fulltext query:
 1. search FT index (the fulltext index), get only the number of search
 results, let it be Rf
 2. search DAC index (the index with permissions), get number of search
 results, let it be Rd

 let maxR be the maximum size of the corpus for the pseudo-join.
 *That was actually my question: what is a reasonable number? 10, 100, 1000
 ?
 *

 if (Rf < maxR) or (Rd < maxR) then use the smaller corpus to join onto the
 second one.
 This happens when (only a few documents contain the search query) OR (the
 user has access to a small number of files).

 In case none of these happens, we can use the "too many documents, man.
 Please refine your query. Partial results below" approach, but first
 searching the FT index, because we want relevant results first.
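
 In rough pseudocode (all helper names here are made up, just to make the
 decision explicit):

 long rf = ftIndex.count(query);       // hits in the fulltext index
 long rd = dacIndex.count(userIds);    // docs this user can access
 if (rf < MAX_R || rd < MAX_R) {
     // join the smaller result set onto the other index
     return (rf <= rd)
         ? dacIndex.filterBy(ftIndex.ids(query))
         : ftIndex.filterBy(dacIndex.ids(userIds));
 }
 return ftIndex.partialResults(query, MAX_R); // "please refine your query"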

 What do you think?

 Regards,
 Oleg




 On Sun, Jul 14, 2013 at 7:42 PM, Erick Erickson erickerick...@gmail.com
 wrote:

  Join performance is most sensitive to the number of values
  in the field being joined on. So if you have lots and lots of
  distinct values in the corpus, join performance will be affected.
 
  bq: I suppose the delete/reindex approach will not change soon
 
  There is ongoing work (search the JIRA for Stacked Segments)
  on actually doing something about this, but it's been under
 consideration
  for at least 3 years so your guess is as good as mine.
 
  bq: notice that the worst situation is when everyone has access to all
 the
  files, it means the first filter will be the full index.
 
  One way to deal with this is to implement a post filter, sometimes
 called
  a no cache filter. The distinction here is that
   1> it is not cached (duh!)
   2> it is only called for documents that have made it through all the
    other lower cost filters (and the main query of course).
   3> lower cost means the filter is either a standard, cached filter,
   or any no cache filter with a cost (explicitly stated in the query)
   lower than this one's.
 
  Critically, and unlike normal filter queries, the result set is NOT
  calculated for all documents ahead of time
 
  You _still_ have to deal with the sysadmin doing a *:* query as you
  are well aware. But one can mitigate that by having the post-filter
  fail all documents after some arbitrary N, and display a message in the
  app like too many documents, man. Please refine your query. Partial
  results below. Of course this may not be acceptable, but
 
  HTH
  Erick
 
  On Sun, Jul 14, 2013 at 12:05 PM, Jack Krupansky
  j...@basetechnology.com wrote:
   Take a look at LucidWorks Search and its access control:
  
 
 http://docs.lucidworks.com/display/help/Search+Filters+for+Access+Control
  
   Role-based security is an easier nut to crack.
  
   Karl Wright of ManifoldCF had a Solr patch for document access control
 at
   one point:
   SOLR-1895 - ManifoldCF SearchComponent plugin for enforcing 

Re: Different 'fl' for first X results

2013-07-15 Thread Alexandre Rafalovitch
Is there a JIRA number for the last one?

Regards,
 Alex
On 15 Jul 2013 17:21, Jack Krupansky j...@basetechnology.com wrote:

 1. Request all fields needed for all results and simply ignore the extra
 field(s) (which can be empty or missing and will automatically be ignored
 by Solr anyway).
 2. Two separate query requests.
 3. A custom search component.
 4. Wait for the new scripted query request handler that gives you full
 control in a custom script.

 -- Jack Krupansky

 -Original Message- From: Weber
 Sent: Monday, July 15, 2013 4:58 PM
 To: solr-user@lucene.apache.org
 Subject: Different 'fl' for first X results

 How to get a different field list in the first X results? For example, in
 the
 first 5 results I want fields A, B, C, and on the next results I need only
 fields A, and B.



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Different-fl-for-first-X-results-tp4078178.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: Different 'fl' for first X results

2013-07-15 Thread Jack Krupansky

SOLR-5005 - JavaScriptRequestHandler
https://issues.apache.org/jira/browse/SOLR-5005

-- Jack Krupansky

-Original Message- 
From: Alexandre Rafalovitch

Sent: Monday, July 15, 2013 6:56 PM
To: solr-user@lucene.apache.org
Subject: Re: Different 'fl' for first X results

Is there a JIRA number for the last one?

Regards,
Alex
On 15 Jul 2013 17:21, Jack Krupansky j...@basetechnology.com wrote:


1. Request all fields needed for all results and simply ignore the extra
field(s) (which can be empty or missing and will automatically be ignored
by Solr anyway).
2. Two separate query requests.
3. A custom search component.
4. Wait for the new scripted query request handler that gives you full
control in a custom script.

-- Jack Krupansky

-Original Message- From: Weber
Sent: Monday, July 15, 2013 4:58 PM
To: solr-user@lucene.apache.org
Subject: Different 'fl' for first X results

How to get a different field list in the first X results? For example, in
the
first 5 results I want fields A, B, C, and on the next results I need only
fields A, and B.



--
View this message in context:
http://lucene.472066.n3.nabble.com/Different-fl-for-first-X-results-tp4078178.html
Sent from the Solr - User mailing list archive at Nabble.com.





Changes in DIrectSpellChecker configuration cause hang on startup

2013-07-15 Thread Brendan Grainger
Hi All,

I changed the name of the queryAnalyzerFieldType for my spellcheck
component and the corresponding field and now when solr starts up, it hangs
at this point:

5797 [searcherExecutor-4-thread-1] INFO  org.apache.solr.core.SolrCore  –
QuerySenderListener sending requests to
Searcher@153d12bf main{StandardDirectoryReader(segments_k9p:127340
_1cz(4.3):C387286/120
_2u1(4.3):C405320/146 _4pl(4.3):C493017/136 _65a(4.3):C322122/160
_7ky(4.3):C312296/147 _936(4.3):C326967/135 _b9j(4.3):C474140/229
_cyy(4.3):C298811/88428 _124m(4.3):C622322/137649

My config for the spellcheckcomponent:

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">

  <str name="queryAnalyzerFieldType">markup</str>

  <!-- Multiple Spell Checkers can be declared and used by this
       component
    -->

  <!-- a spellchecker built from a field of the main index -->
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">markup_texts</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
    <!-- the spellcheck distance measure used, the default is the
         internal levenshtein -->
    <str name="distanceMeasure">internal</str>
    <!-- minimum accuracy needed to be considered a valid spellcheck
         suggestion -->
    <float name="accuracy">0.5</float>
    <!-- the maximum #edits we consider when enumerating terms: can be 1
         or 2 -->
    <int name="maxEdits">1</int>
    <!-- the minimum shared prefix when enumerating terms -->
    <int name="minPrefix">1</int>
    <!-- maximum number of inspections per result. -->
    <int name="maxInspections">5</int>
    <!-- minimum length of a query term to be considered for correction -->
    <int name="minQueryLength">4</int>
    <!-- maximum threshold of documents a query term can appear to be
         considered for correction -->
    <float name="maxQueryFrequency">0.01</float>
    <!-- uncomment this to require suggestions to occur in 1% of the
         documents
      <float name="thresholdTokenFrequency">.01</float>
    -->
  </lst>

Has anyone got some insight?

Thanks


How to use joins in solr 4.3.1

2013-07-15 Thread Utkarsh Sengar
Hello,

I am trying to join data between two cores: merchant and location

This is my query:
http://_server_.com:8983/solr/location/select?q={!join from=merchantId
to=merchantId fromIndex=merchant}walgreens
Ref: http://wiki.apache.org/solr/Join
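
(The join can also be applied as a filter with an explicit field query --
a sketch, assuming the merchant core has a searchable name field:
http://_server_.com:8983/solr/location/select?q=*:*&fq={!join from=merchantId to=merchantId fromIndex=merchant}name:walgreens
)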


The merchant core has documents for the query "walgreens" with a merchantId of 1.
A simple query: http://_server_.com:8983/solr/location/select?q=walgreens
returns documents called walgreens with merchantId=1

Location core has documents with merchantId=1 too.

But my join query returns no documents.

This is the response I get:
{
  "responseHeader":{
    "status":0,
    "QTime":5,
    "params":{
      "debugQuery":"true",
      "indent":"true",
      "q":"{!join from=merchantId to=merchantId
fromIndex=merchant}walgreens",
      "wt":"json"}},
  "response":{"numFound":0,"start":0,"maxScore":0.0,"docs":[]
  },
  "debug":{
    "rawquerystring":"{!join from=merchantId to=merchantId
fromIndex=merchant}walgreens",
    "querystring":"{!join from=merchantId to=merchantId
fromIndex=merchant}walgreens",
    "parsedquery":"JoinQuery({!join from=merchantId to=merchantId
fromIndex=merchant}allText:walgreens)",
    "parsedquery_toString":"{!join from=merchantId to=merchantId
fromIndex=merchant}allText:walgreens",
    "QParser":"",
    "explain":{}}}


Any suggestions?


-- 
Thanks,
-Utkarsh


Re: How to use joins in solr 4.3.1

2013-07-15 Thread Utkarsh Sengar
I have also tried these queries (as per this SO answer:
http://stackoverflow.com/questions/12665797/is-solr-4-0-capable-of-using-join-for-multiple-core
)

1. http://_server_.com:8983/solr/location/select?q=:fq={!join
from=merchantId to=merchantId fromIndex=merchant}walgreens

And I get this:

{
  responseHeader:{
status:400,
QTime:1,
params:{
  indent:true,
  q::,
  wt:json,
  fq:{!join from=merchantId to=merchantId
fromIndex=merchant}walgreens}},
  error:{
msg:org.apache.solr.search.SyntaxError: Cannot parse ':':
Encountered \ \:\ \: \\ at line 1, column 0.\nWas expecting one
of:\nNOT ...\n\+\ ...\n\-\ ...\nBAREOPER ...\n
   \(\ ...\n\*\ ...\nQUOTED ...\nTERM ...\n
PREFIXTERM ...\nWILDTERM ...\nREGEXPTERM ...\n\[\
...\n\{\ ...\nLPARAMS ...\nNUMBER ...\nTERM
...\n\*\ ...\n,
code:400}}

And this:
2. http://_server_.com:8983/solr/location/select?q=walgreens&fq={!join
from=merchantId to=merchantId fromIndex=merchant}

{
  responseHeader:{
status:500,
QTime:5,
params:{
  indent:true,
  q:walgreens,
  wt:json,
  fq:{!join from=merchantId to=merchantId fromIndex=merchant}}},
  error:{
msg:Server at http://_SERVER_:8983/solr/location returned non
ok status:500, message:Server Error,

trace:org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:
Server at http://_SERVER_:8983/solr/location returned non ok
status:500, message:Server Error\n\tat
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:372)\n\tat
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)\n\tat
org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:156)\n\tat
org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:119)\n\tat
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)\n\tat
java.util.concurrent.FutureTask.run(FutureTask.java:138)\n\tat
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)\n\tat
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)\n\tat
java.util.concurrent.FutureTask.run(FutureTask.java:138)\n\tat
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)\n\tat
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)\n\tat
java.lang.Thread.run(Thread.java:662)\n,
code:500}}

Thanks,
-Utkarsh



On Mon, Jul 15, 2013 at 4:27 PM, Utkarsh Sengar utkarsh2...@gmail.com wrote:

 Hello,

 I am trying to join data between two cores: merchant and location

 This is my query:
 http://_server_.com:8983/solr/location/select?q={!join from=merchantId
 to=merchantId fromIndex=merchant}walgreens
 Ref: http://wiki.apache.org/solr/Join


 The merchant core has documents for the query "walgreens" with a merchantId
 of 1.
  A simple query: http://_server_.com:8983/solr/location/select?q=walgreens
 returns documents called walgreens with merchantId=1

 Location core has documents with merchantId=1 too.

 But my join query returns no documents.

 This is the response I get:
  {
    "responseHeader":{
      "status":0,
      "QTime":5,
      "params":{
        "debugQuery":"true",
        "indent":"true",
        "q":"{!join from=merchantId to=merchantId
  fromIndex=merchant}walgreens",
        "wt":"json"}},
    "response":{"numFound":0,"start":0,"maxScore":0.0,"docs":[]
    },
    "debug":{
      "rawquerystring":"{!join from=merchantId to=merchantId
  fromIndex=merchant}walgreens",
      "querystring":"{!join from=merchantId to=merchantId
  fromIndex=merchant}walgreens",
      "parsedquery":"JoinQuery({!join from=merchantId to=merchantId
  fromIndex=merchant}allText:walgreens)",
      "parsedquery_toString":"{!join from=merchantId to=merchantId
  fromIndex=merchant}allText:walgreens",
      "QParser":"",
      "explain":{}}}


 Any suggestions?


 --
 Thanks,
 -Utkarsh




-- 
Thanks,
-Utkarsh


Re: SolrCloud: how to index documents into a specific core and how to search against that core?

2013-07-15 Thread Jie Sun
Yandong,
have you figured out if it works for you to use one collection per customer? 

We have the similar use-case as yours: customer id's are used as core names.

that was the reason our company did not upgrade to SolrCloud ... I might
remember it wrong, but I vaguely remember I looked into using a collection
for each customer, and it seems the number of collections in the current
release is fixed, isn't it?

thanks
Jie



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrCloud-how-to-index-documents-into-a-specific-core-and-how-to-search-against-that-core-tp3985262p4078210.html
Sent from the Solr - User mailing list archive at Nabble.com.


SolrCloud: Collection API question and problem with core loading

2013-07-15 Thread Patrick Mi
Hi there,

I run 2 Solr instances (Tomcat 7, Solr 4.3.0, one shard), one external
Zookeeper instance, and have lots of cores.

I use the Collections API to create new cores dynamically after the
configuration for the core is uploaded to Zookeeper, and it all works
fine.

As there are so many cores, it takes a very long time to load them all at
startup. I would like to start the server quickly and load the cores on
demand.

When a core is created via the Collections API, it is created with the
default parameter loadOnStartup=true (this can be seen in solr.xml).

Question: is there a way to specify this parameter so it can be set to
'false' via the Collections API?

Problem: if I manually set loadOnStartup=false for a core, I got the
exception below when I used CloudSolrServer to query it:
Error: org.apache.solr.client.solrj.SolrServerException: No live SolrServers
available to handle this request

Seems to me that CloudSolrServer will not trigger the core to be loaded. 

Is it possible to get the core loaded using CloudSolrServer?
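
For reference, the flags in question sit on the core element in a 4.x
solr.xml. This is a sketch only, with made-up names, and it does not answer
whether the Collections API can set them:

<core name="customer1" instanceDir="customer1"
      loadOnStartup="false" transient="true"/>

Setting transient="true" as well lets Solr unload such cores again once the
transientCacheSize limit on the cores element is reached.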

Regards,
Patrick




Re: Example for DIH data source through query string

2013-07-15 Thread Kiran J
Thank you Alex.


On Mon, Jul 15, 2013 at 12:37 PM, Alexandre Rafalovitch
arafa...@gmail.com wrote:

 I don't think you can get there from here.

 But you can specify the config file on the query line. If you only have a
 couple of configurations, you could keep them in different files and switch
 that way.
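
 Another angle along the same lines: request parameters are exposed to the
 DIH config as ${dataimporter.request.*}, so the data source can be built
 from placeholders. A sketch, with made-up parameter names:

 <dataSource driver="com.mysql.jdbc.Driver"
             url="${dataimporter.request.dburl}"
             user="${dataimporter.request.dbuser}"
             password="${dataimporter.request.dbpassword}"/>

 and then something like:
 http://localhost:8983/solr/dataimport?command=full-import&dburl=jdbc:mysql://localhost/dbname&dbuser=db_username&dbpassword=db_password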

 Regards,
Alex.

 Personal website: http://www.outerthoughts.com/
 LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
 - Time is the quality of nature that keeps events from happening all at
 once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)


 On Mon, Jul 15, 2013 at 2:56 PM, Kiran J kiranjuni...@gmail.com wrote:

  Hi,
 
  I want to dynamically specify the data source in the URL when invoking
 data
  import handler. I'm looking at this :
 
  http://wiki.apache.org/solr/DataImportHandler#solrconfigdatasource
 
    <requestHandler name="/dataimport"
        class="org.apache.solr.handler.dataimport.DataImportHandler">
      <lst name="defaults">
        <str name="config">/home/username/data-config.xml</str>
        <lst name="datasource">
          <str name="driver">com.mysql.jdbc.Driver</str>
          <str name="url">jdbc:mysql://localhost/dbname</str>
          <str name="user">db_username</str>
          <str name="password">db_password</str>
        </lst>
      </lst>
    </requestHandler>
 
 
  Can anyone give me a good example ?
 
  ie http://localhost:8983/solr/dataimport?datasource=what goes here ?
 
  Your help is much appreciated.
 
  Thanks
 



Book contest idea - feedback requested

2013-07-15 Thread Alexandre Rafalovitch
Hello,

Packt Publishing has kindly agreed to let me run a contest with e-copies of
my book as prizes:
http://www.packtpub.com/apache-solr-for-indexing-data/book

Since my book is about learning Solr and targeted at beginners and early
intermediates, here is what I would like to do. I am asking for feedback on
whether people on the mailing list like the idea or have specific
objections to it.

1) The basic idea is to get Solr users to write and vote on what they find
hard with Solr, especially in understanding the features (as contrasted
with just missing ones).
2) I'll probably set it up as a User Voice forum, which has all the
mechanisms for suggesting and voting on ideas, with an easier interface
than JIRA.
3) The top N voted ideas will get the books as prizes, and I will try to
fix/document/create JIRAs for those issues.
4) I am hoping to specifically reach out to the communities where Solr is a
component and where they don't necessarily hang out on our mailing list. I
am thinking SolrNet, Drupal, project Blacklight, Cloudera, CrafterCMS,
SiteCore, Typo3, SunSpot, Nutch. Obviously, anybody and everybody from this
list would be absolutely welcome to participate as well.

Yes? No? Suggestions?

Also, if you are the maintainer of one of the products/services/libraries
that has Solr in it and want to reach out to your community yourself, I
think it would be a lot better than if I did it. Contact me directly and I will let
you know what template/FAQ I want you to include in the announcement
message when it is ready.

Thank you all in advance for the comments and suggestions.

Regards,
   Alex.

Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)


Re: MorphlineSolrSink

2013-07-15 Thread Israel Ekpo
Rajesh,

I think this question is better suited for the FLUME user mailing list.

You will need to configure the sink with the expected values so that the
events from the channels can head to the right place.

On Mon, Jul 15, 2013 at 4:49 PM, Rajesh Jain rjai...@gmail.com wrote:

 Newbie question:

 I have a Flume server where I am writing to a sink, which is a RollingFile
 sink.

 I have to take these files from the sink and send them to Solr, which can
 index them and provide search.

 Do I need to configure MorphlineSolrSink?

 What is the mechanism to do this, or to send this data over to Solr?

 Thanks,
 Rajesh




-- 
°O°
Good Enough is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Clearing old nodes from zookeper without restarting solrcloud cluster

2013-07-15 Thread Luis Carlos Guerrero Covo
I know that you can clear Zookeeper's data directory using the CLI with the
clear command; I just want to know if it's possible to update the cluster's
state without wiping everything out. Anyone have any ideas/suggestions?
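
For reference, the clear command in question is Solr's zkcli.sh. A sketch,
where the host and chroot path are assumptions:

cloud-scripts/zkcli.sh -zkhost localhost:2181 -cmd clear /solr

It wipes the given subtree wholesale, hence the question about a more
surgical update.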


On Mon, Jul 15, 2013 at 11:21 AM, Luis Carlos Guerrero Covo 
lcguerreroc...@gmail.com wrote:

 Hi,

  Is there an easy way to clear Zookeeper of all offline Solr nodes without
  restarting the cluster? We are having some stability issues, and we think
  it may be due to the leader querying old offline nodes.

 thank you,

 Luis Guerrero




-- 
Luis Carlos Guerrero Covo
M.S. Computer Engineering
(57) 3183542047


Re: Book contest idea - feedback requested

2013-07-15 Thread Ali, Saqib
Hello Alex,

This sounds like an excellent idea! :)

Saqib





Re: Clearing old nodes from zookeper without restarting solrcloud cluster

2013-07-15 Thread Ali, Saqib
Hello Luis,

I don't think that is possible. If you delete clusterstate.json from
Zookeeper, you will need to restart the nodes. I could be very wrong
about this.

Saqib





Re: MorphlineSolrSink

2013-07-15 Thread Ashish
On Tue, Jul 16, 2013 at 2:19 AM, Rajesh Jain rjai...@gmail.com wrote:

 Newbie question:

 I have a Flume server where I am writing to a sink, which is a RollingFile
 sink.

 I have to take these files from the sink and send them to Solr, which can
 index them and provide search.

 Do I need to configure MorphlineSolrSink?


Yes



 What is the mechanism to do this, or to send this data over to Solr?


More details here
http://flume.apache.org/FlumeUserGuide.html#morphlinesolrsink
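
A minimal sketch of the sink section of flume.conf, where the agent and
channel names are assumptions:

agent1.sinks.solrSink.type = org.apache.flume.sink.solr.morphline.MorphlineSolrSink
agent1.sinks.solrSink.channel = memoryChannel
agent1.sinks.solrSink.morphlineFile = /etc/flume-ng/conf/morphline.conf

The morphline file referenced there is what typically holds the loadSolr
command pointing at your Solr location.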

As suggested, please move further related questions to the Flume user ML.



 Thanks,
 Rajesh




-- 
thanks
ashish

Blog: http://www.ashishpaliwal.com/blog
My Photo Galleries: http://www.pbase.com/ashishpaliwal


Re: Facet sorting seems weird

2013-07-15 Thread William Bell
Alex,

You could submit a JIRA ticket to add an option like facet.sort=insensitive,
along with the per-field f.<fieldname>.facet.sort syntax.

Then we all get the benefit of the new feature.



On Mon, Jul 15, 2013 at 9:16 AM, Alexandre Rafalovitch
arafa...@gmail.com wrote:

 Hi Henrik,

 If I understand the question correctly (case-insensitive sorting of the
 facet values), then this is a limitation of the current FacetComponent.

 You can see the full implementation at:

 https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/handler/component/FacetComponent.java#L818

 If you are comfortable with Java code, the easiest thing might be to copy
 and fix the component and use your own version for faceting. The components
 are defined in solrconfig.xml, and FacetComponent is in the default chain.
 See:

 https://github.com/apache/lucene-solr/blob/trunk/solr/example/solr/collection1/conf/solrconfig.xml#L1194

 If you do manage to do this (I would recommend doing it as an extra
 option), it would be nice to have it contributed back to Solr. I think you
 are not the only one with this requirement.
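
 A schema-level workaround, short of patching the component, is to facet on
 a lowercased copy of the field. A sketch with made-up names, at the cost
 that the facet values come back lowercased:

 <fieldType name="string_ci" class="solr.TextField" sortMissingLast="true">
   <analyzer>
     <tokenizer class="solr.KeywordTokenizerFactory"/>
     <filter class="solr.LowerCaseFilterFactory"/>
   </analyzer>
 </fieldType>

 <field name="facet_brand_ci" type="string_ci" indexed="true" stored="false"/>
 <copyField source="brand" dest="facet_brand_ci"/>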

 Regards,
Alex.

 Personal website: http://www.outerthoughts.com/
 LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
 - Time is the quality of nature that keeps events from happening all at
 once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)


 On Mon, Jul 15, 2013 at 10:08 AM, Henrik Ossipoff Hansen 
 h...@entertainment-trading.com wrote:

  Hello, first time writing to the list. I am a developer for a company
  where we recently switched all of our search cores from Sphinx to Solr,
  with great results. In general we've been very happy with the switch, and
  everything seems to work just as we want it to.

  Today, however, we've run into a bit of an issue regarding faceted sort.

  For example, we have a field called brand in our core, defined with the
  text_en datatype from the example Solr core. This field is copied into
  facet_brand with the datatype string (since we don't really need to do
  much with it except show it for faceted navigation).

  Now, given these two entries in the field on different documents, "LEGO"
  and "bObles", and given facet.sort=index, it appears that "LEGO" is sorted
  before "bObles". I assume this is because of casing differences.
 
  My question then is, how do we define a decent datatype in our schema,
  where the casing is exact, but we are able to sort it without casing
  mattering?
 
  Thank you :)
 
  Best regards,
  Henrik Ossipoff
 




-- 
Bill Bell
billnb...@gmail.com
cell 720-256-8076


Re: Book contest idea - feedback requested

2013-07-15 Thread Sandeep Gupta
Hi Alex,

great please go ahead..

-Sandeep


On Tue, Jul 16, 2013 at 9:40 AM, Ali, Saqib docbook@gmail.com wrote:

 Hello Alex,

 This sounds like an excellent idea! :)

 Saqib


 



Re: Doc's FunctionQuery result field in my custom SearchComponent class ?

2013-07-15 Thread Tony Mullins
No, sorry, I am still not getting the termfreq() field in my 'doc' object.
I do get the _version_ field in my 'doc' object, which I think is
realValue=StoredField.

At which point does termfreq() or any other FunctionQuery field become part
of the doc object in Solr? And at that point, can I perform some custom
logic and append it to the response?
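
One way around it, rather than waiting for the pseudo-field, is to recompute
the frequency inside the component straight from the postings. This is a
sketch against the Lucene 4.x API, with the field and term hard-coded as
assumptions:

import org.apache.lucene.index.DocsEnum;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiFields;
import org.apache.lucene.util.BytesRef;

// inside process(ResponseBuilder rb), for each docId taken from the DocList:
IndexReader reader = rb.req.getSearcher().getIndexReader();
DocsEnum postings = MultiFields.getTermDocsEnum(reader,
        MultiFields.getLiveDocs(reader), "product", new BytesRef("spider"));
int freq = 0;
// advance() only moves forward, so walk docIds in increasing order
// or re-acquire the enum per document
if (postings != null && postings.advance(docId) == docId) {
    freq = postings.freq(); // term frequency of "spider" in this doc
}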

Thanks.
Tony





On Tue, Jul 16, 2013 at 1:34 AM, Patanachai Tangchaisin 
patanachai.tangchai...@wizecommerce.com wrote:

 Hi,

 I think the process of retrieving stored fields (through fl) happens
 after the SearchComponents run.

 One solution: if you wrap the q param in a function, your score will be
 the result of the function.
 For example,

 http://localhost:8080/solr/collection2/demoendpoint?q=termfreq%28product,%27spider%27%29&wt=xml&indent=true&fl=*,score


 Now your score is going to be a result of termfreq(product,'spider')
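
 (If you go this route, the value should then be reachable inside a
 component as the document score, e.g. DocIterator.score() while walking
 the DocList, provided the DocList was collected with scores, such as with
 the GET_SCORES flag. A sketch of the idea, not tested.)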


 --
 Patanachai Tangchaisin



 On 07/15/2013 12:01 PM, Tony Mullins wrote:

 any help plz !!!


 On Mon, Jul 15, 2013 at 4:13 PM, Tony Mullins tonymullins...@gmail.com wrote:

  Please, any help on how to get the value of the 'freq' field in my custom
 SearchComponent?


 http://localhost:8080/solr/collection2/demoendpoint?q=spider&wt=xml&indent=true&fl=*,freq:termfreq%28product,%27spider%27%29

 <doc>
   <str name="id">11</str>
   <str name="type">Video Games</str>
   <str name="format">xbox 360</str>
   <str name="product">The Amazing Spider-Man</str>
   <int name="popularity">11</int>
   <long name="_version_">1439994081345273856</long>
   <int name="freq">1</int>
 </doc>



 Here is my code:

 DocList docs = rb.getResults().docList;
 DocIterator iterator = docs.iterator();
 int sumFreq = 0;
 String id = null;

 for (int i = 0; i < docs.size(); i++) {
     try {
         int docId = iterator.nextDoc();

         // Document doc = searcher.doc(docId, fieldSet);
         Document doc = searcher.doc(docId);

 In the doc object I can see schema fields like 'id', 'type', 'format', etc.,
 but I cannot find the 'freq' field, which I need. Is there any way to get
 the FunctionQuery fields in the doc object?

 Thanks,
 Tony



 On Mon, Jul 15, 2013 at 1:16 PM, Tony Mullins tonymullins...@gmail.com wrote:

  Hi,

  I have extended Solr's SearchComponent class and I am iterating through
  all the docs in the ResponseBuilder in the overridden process() method.

  Here I want to get the value of the FunctionQuery result, but in the
  Document object I am only seeing the standard fields of the document, not
  the FunctionQuery result.

 This is my query


  http://localhost:8080/solr/collection2/demoendpoint?q=spider&wt=xml&indent=true&fl=*,freq:termfreq%28product,%27spider%27%29

  The result of the above query in the browser shows that 'freq' is part of
  the doc, but it's not in the Document object in my overridden process()
  method.

 How can I get the value of FunctionQuery result in my custom
 SearchComponent ?

 Thanks,
 Tony



