Re: Solr(j) API for manipulating the schema(.xml)?

2014-09-20 Thread Jorge Luis Betancourt Gonzalez
You could define a set of dynamic fields in schema.xml, one for each type of
data (and each combination of attributes) your framework needs, and then write
a small wrapper around SolrJ that exposes those patterns in a more
understandable way. That way you abstract away the manipulation of schema.xml
and only touch the file when it is really needed, e.g. for a new field type
with new analyzers.
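
A minimal sketch of such a wrapper, assuming the schema declares dynamicField patterns like "*_s", "*_i" and so on. The suffixes loosely follow Solr's example schema, and FieldKind, DynamicDoc and their methods are hypothetical names used only for illustration. The sketch deliberately has no SolrJ dependency; it only computes the physical field names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Field kinds the indexing framework knows about (hypothetical set and suffixes). */
enum FieldKind {
    STRING("_s"), TEXT("_t"), INT("_i"), LONG("_l"), DOUBLE("_d"), BOOLEAN("_b"), DATE("_dt");

    final String suffix;

    FieldKind(String suffix) {
        this.suffix = suffix;
    }
}

/** Collects typed values under names that match dynamicField patterns such as "*_s". */
final class DynamicDoc {
    private final Map<String, Object> fields = new LinkedHashMap<>();

    /** Maps a logical name and kind to the physical field name, e.g. title + STRING gives title_s. */
    static String solrName(String logicalName, FieldKind kind) {
        if (logicalName == null || !logicalName.matches("[A-Za-z][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("unsupported field name: " + logicalName);
        }
        return logicalName + kind.suffix;
    }

    /** Stores a value under the physical name derived from the logical name and kind. */
    DynamicDoc set(String logicalName, FieldKind kind, Object value) {
        fields.put(solrName(logicalName, kind), value);
        return this;
    }

    /** Returns a copy of the physical name/value pairs collected so far. */
    Map<String, Object> toSolrFields() {
        return new LinkedHashMap<>(fields);
    }
}
```

Each entry of toSolrFields() could then be copied into a SolrInputDocument with addField(name, value), so callers never deal with the suffix convention themselves.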

On Sep 18, 2014, at 3:16 AM, Clemens Wyss DEV  wrote:

> As our framework so far only knows a few field types, dynamic fields may be 
> the way to go. And if there are new field types, the new schema can be 
> distributed through ZooKeeper.
> 
> -----Original Message-----
> From: Erick Erickson [mailto:erickerick...@gmail.com] 
> Sent: Wednesday, 17 September 2014 19:56
> To: solr-user@lucene.apache.org
> Subject: Re: Solr(j) API for manipulating the schema(.xml)?
> 
> Right, you can create new cores over the REST API.
> 
> As far as changing the schema, there's no good way to do that that I know of 
> programmatically. In the SolrCloud world, you can upload the schema to 
> ZooKeeper and have it automatically distributed to all the nodes though.
> 
> Best,
> Erick
> 
> On Wed, Sep 17, 2014 at 2:28 AM, Clemens Wyss DEV  
> wrote:
>> Is there an API to manipulate/consolidate the schema(.xml) of a Solr-core? 
>> Through SolrJ?
>> 
>> Context:
>> We already have a generic indexing/searching framework (based on Lucene) 
>> where any component can act as a so-called IndexDataProvider. This provider 
>> delivers the field types and also the entities to be (converted into 
>> documents and then) indexed. Each of these IndexDataProviders has its own 
>> Lucene index.
>> So we kind of have the information for the Solr schema.xml.
>> 
>> Hope the intention is clear. And yes, the manipulation of the schema.xml is 
>> basically only needed when the field types change. That's why I am looking 
>> for a way to consolidate the schema.xml (upon boot, initialization of the 
>> IndexDataProviders ...).
>> In 99.999% of cases it won't change, but I'd like to keep the possibility 
>> of an IndexDataProvider handing in "its schema".
>> 
>> Also, again driven by the dynamic nature of our framework, can I easily 
>> create new cores over SolrJ or the Solr REST API?



[ANNOUNCE] Apache Gora 0.5 Release

2014-09-20 Thread lewis john mcgibbney
Hi Folks,
Apologies for cross posting.
The Apache Gora team are pleased to announce the immediate availability of
Apache Gora 0.5.

The Apache Gora open source framework provides an in-memory data model and
persistence for big data. Gora supports persisting to column stores, key
value stores, document stores and RDBMSs, and analyzing the data with
extensive Apache Hadoop™ MapReduce support. Gora uses the Apache Software
License v2.0.

This release addresses no fewer than 44 issues [0], many of them
improvements and new functionality. Most notably, the release includes a
new module for MongoDB, shim functionality to support multiple Hadoop
versions, improved authentication for Accumulo, better documentation for
many modules, and pluggable SolrJ implementations, with http
(HttpSolrServer) as the default. Available options are http
(HttpSolrServer), cloud (CloudSolrServer), concurrent
(ConcurrentUpdateSolrServer) and loadbalance (LBHttpSolrServer).

Suggested Gora database support is as follows:

   - Apache Avro 1.7.6
   - Apache Hadoop 1.0.1 and 2.4.0
   - Apache HBase 0.94.14
   - Apache Cassandra 2.0.2
   - Apache Solr 4.8.1
   - MongoDB 2.6
   - Apache Accumulo 1.5.1

Gora is released both as source code, which can be downloaded from our
downloads page [1], and as Maven artifacts, which can be found on Maven
Central [2].

Thank you

Lewis

(on behalf of the Apache Gora PMC)

[0] http://s.apache.org/0.5report
[1] http://gora.apache.org/downloads.html
[2] http://search.maven.org/#search|ga|1|gora


-- 


http://people.apache.org/~lewismc || @hectorMcSpector ||
http://www.linkedin.com/in/lmcgibbney

Apache Gora V.P || Apache Nutch PMC || Apache Any23 V.P || Apache OODT PMC ||
 Apache Open Climate Workbench PMC || Apache Tika PMC || Apache TAC


Re: Help on custom sort

2014-09-20 Thread Erick Erickson
How many different groups are there? And can user A ever be part of
more than one group?
If
1> there are a reasonably small number of groups (< 100 or so as a
place to start)
and
2> a user is always part of a single group

then you could store separate prices in each document by group, thus
you'd have some fields like
price_group_a: $100
price_group_b: $101

then sorting becomes trivial: you just sort on price_group_a for
users in group A, etc. If the number of groups is unknown but not huge,
dynamic fields could be used.

If that's not the case, then you might be able to get clever with
sorting by function, here's a place to start:
https://cwiki.apache.org/confluence/display/solr/Function+Queries

These can be arbitrarily complex, but I'm thinking something where the
price returned by the function respects the group the user is in,
perhaps even the min/max of all the groups the user is in. I admit I
haven't really thought that through well though...
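
To make both ideas concrete, here is a rough sketch of how a client could build the sort parameter. It assumes the per-group fields follow the price_group_a naming above, and for a user in several groups it falls back on Solr's min() function query to sort by the cheapest applicable price. GroupPriceSort and its methods are hypothetical names, and the rule that group ids are lower-case alphanumerics is an assumption.

```java
import java.util.List;
import java.util.StringJoiner;

/** Builds the value of Solr's "sort" parameter from the groups a user belongs to. */
final class GroupPriceSort {

    /** Maps a group id to its price field, e.g. "a" gives price_group_a. */
    static String field(String groupId) {
        // Restricting ids keeps arbitrary user input out of the sort expression.
        if (groupId == null || !groupId.matches("[a-z0-9_]+")) {
            throw new IllegalArgumentException("unexpected group id: " + groupId);
        }
        return "price_group_" + groupId;
    }

    /** Plain field sort for one group, min(...) over the price fields for several. */
    static String sortParam(List<String> groupIds, boolean ascending) {
        if (groupIds.isEmpty()) {
            throw new IllegalArgumentException("user must belong to at least one group");
        }
        String direction = ascending ? " asc" : " desc";
        if (groupIds.size() == 1) {
            return field(groupIds.get(0)) + direction;
        }
        StringJoiner args = new StringJoiner(",", "min(", ")");
        for (String id : groupIds) {
            args.add(field(id));
        }
        return args + direction;
    }
}
```

The resulting string would be passed as the sort parameter of the query. With the function form it is worth checking how documents with no value in one of the price fields are ordered before relying on it.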

Best,
Erick

On Sat, Sep 20, 2014 at 9:26 AM, Scott Smith  wrote:
> I need to provide a custom sort option for sorting by price and I would like 
> some suggestions.  It's not the straightforward "just sort by a price field 
> in the document" scenario or I wouldn't be asking for help.  Here's the 
> scenario I'm dealing with.
>
> I have 100 million+ documents (so multi-sharded).  Users search for documents 
> they are interested in using a standard keyword search.  They then purchase 
> documents they are interested in.  So far, nothing hard.
>
> Here's where things get "interesting".  The documents come from multiple 
> suppliers.  Each supplier sets a price for his documents and different 
> suppliers will provide different pricing.
>
> That wouldn't be difficult except that *users* are divided up into different 
> groups and depending on which group they are in, the supplier will charge the 
> user a different price.  So, user A may pay one price for a document and user 
> B may pay a different price for the same document just because user A and 
> user B are in different groups.  I don't even know if the relative order or 
> pricing is the same between different groups (e.g., if document X is more 
> expensive than document Y for a user in group M, it may not be more expensive 
> for a user in group N).  The one thing that may make this doable is that 
> supplier A will likely have the same price for all of his documents for each 
> of the user groups.  So, a user in group A will pay the same price regardless 
> of which document he buys from supplier 1.  A user in group B will also pay 
> the same price for any document from supplier 1; it's just that a user in 
> group B will likely pay a different price than a user in group A.  So, within 
> a supplier, the price varies based on user group, not the document.
>
> To summarize, one of the requirements for the system is that we provide the 
> ability to sort search results based on price.  This would be easy except 
> that the price a user pays depends not only on what he wants to buy, but 
> also on what group he is in.
>
> I suspect there is some kind of custom solr module I'm going to have to 
> write.  I'm thinking that the user group gets passed in as a custom solr 
> parameter (I'm assuming that's possible??).  Then I'm thinking that there has 
> to be some kind of in-memory database that tracks pricing based on user group 
> and document supplier.
>
> I'm happy to go read code, documents, links, etc if someone can point me in 
> the right direction.  What kind of solr module am I likely going to write 
> (extend) and are there some examples somewhere?  Maybe there's a way to do 
> this without having to extend a solr module??
>
> Hope this makes sense.  Any help is appreciated.
>
> Scott
>
>


Re: Will commit/softcommit invalid filtercache?

2014-09-20 Thread Erick Erickson
This should help, about 1/3 of the way down are the answers to your
specific questions...

https://lucidworks.com/blog/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

Best,
Erick

On Sat, Sep 20, 2014 at 2:38 AM, forest_soup  wrote:
> Hi, all.
>
> We have some questions about commit/softcommit and caching.
> We understand that a softcommit will create a new searcher. Will the
> filtercache be invalidated after a softcommit is done?
>
> And likewise for a commit: if we commit with openSearcher, will the
> filtercache be invalidated?
>
> Thanks!
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Will-commit-softcommit-invalid-filtercache-tp4160153.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Help on custom sort

2014-09-20 Thread Scott Smith
I need to provide a custom sort option for sorting by price and I would like 
some suggestions.  It's not the straightforward "just sort by a price field in 
the document" scenario or I wouldn't be asking for help.  Here's the scenario 
I'm dealing with.

I have 100 million+ documents (so multi-sharded).  Users search for documents 
they are interested in using a standard keyword search.  They then purchase 
documents they are interested in.  So far, nothing hard.

Here's where things get "interesting".  The documents come from multiple 
suppliers.  Each supplier sets a price for his documents and different 
suppliers will provide different pricing.

That wouldn't be difficult except that *users* are divided up into different 
groups and depending on which group they are in, the supplier will charge the 
user a different price.  So, user A may pay one price for a document and user B 
may pay a different price for the same document just because user A and user B 
are in different groups.  I don't even know if the relative order or pricing is 
the same between different groups (e.g., if document X is more expensive than 
document Y for a user in group M, it may not be more expensive for a user in 
group N).  The one thing that may make this doable is that supplier A will 
likely have the same price for all of his documents for each of the user 
groups.  So, a user in group A will pay the same price regardless of which 
document he buys from supplier 1.  A user in group B will also pay the same 
price for any document from supplier 1; it's just that a user in group B will 
likely pay a different price than a user in group A.  So, within a supplier, 
the price varies based on user group, not the document.

To summarize, one of the requirements for the system is that we provide the 
ability to sort search results based on price.  This would be easy except that 
the price a user pays depends not only on what he wants to buy, but also on 
what group he is in.

I suspect there is some kind of custom solr module I'm going to have to write.  
I'm thinking that the user group gets passed in as a custom solr parameter (I'm 
assuming that's possible??).  Then I'm thinking that there has to be some kind 
of in-memory database that tracks pricing based on user group and document 
supplier.

I'm happy to go read code, documents, links, etc if someone can point me in the 
right direction.  What kind of solr module am I likely going to write (extend) 
and are there some examples somewhere?  Maybe there's a way to do this without 
having to extend a solr module??

Hope this makes sense.  Any help is appreciated.

Scott




Re: Error Instantiating UpdateRequestProcessorFactory

2014-09-20 Thread Allistair C
I’ve found the issue.

- First, my IDE was bundling all the Solr JAR dependencies into my custom JAR. 
I noticed the JAR was 14MB when it should have been a few KB. I changed the 
build so that the JAR contains only my own classes.

- I then ran into ClassNotFoundExceptions (CNFEs) for the Solr 
UpdateRequestProcessorFactory and UpdateRequestProcessor classes. This was 
because I had put my JAR in Tomcat’s lib folder, whose classes are loaded 
before the Solr web app’s libs, so my classes could not find their Solr 
dependencies. Moving my JAR into the Solr web app’s WEB-INF/lib resolved this.
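
For anyone hitting the same "is not a ..." error: the most likely explanation is that a class is identified by its name plus the class loader that defined it, so a second copy of the same base class loaded elsewhere is a different type. The sketch below reproduces that effect with stand-in classes only. BaseFactory, CustomFactory and LoaderMismatchDemo are hypothetical names, no Solr or Tomcat code is involved, and it assumes the classes are compiled to a directory or JAR on the classpath.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;

/** Stand-in for a framework class such as UpdateRequestProcessorFactory. */
class BaseFactory {}

/** Stand-in for the custom subclass shipped in a separate JAR. */
class CustomFactory extends BaseFactory {}

final class LoaderMismatchDemo {

    /**
     * Loads CustomFactory, and with it a second copy of BaseFactory, through an
     * isolated class loader and checks it against the original BaseFactory.
     */
    static boolean assignableAcrossLoaders() {
        URL classes = BaseFactory.class.getProtectionDomain().getCodeSource().getLocation();
        // parent = null: nothing is delegated to the application class loader
        try (URLClassLoader isolated = new URLClassLoader(new URL[] {classes}, null)) {
            Class<?> duplicate = isolated.loadClass(CustomFactory.class.getName());
            return BaseFactory.class.isAssignableFrom(duplicate);
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Same loader on both sides: the usual subclass check
        System.out.println(BaseFactory.class.isAssignableFrom(CustomFactory.class));
        // Subclass defined by a different loader, extending that loader's own BaseFactory copy
        System.out.println(assignableAcrossLoaders());
    }
}
```

The first check succeeds and the second fails even though the bytecode is identical. That would also explain why a standalone isAssignableFrom test passes while the same check inside the container does not.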

Cheers

On 20 Sep 2014, at 05:30, Shalin Shekhar Mangar  wrote:

> Sounds like a class loader issue. Try adding your jar to $SOLR_HOME/lib
> instead of tomcat lib.
> 
> Also, upgrade to Solr 4.x, 3.6 is ancient! :)
> 
> On Sat, Sep 20, 2014 at 1:13 AM, Allistair C  wrote:
> 
>> Hi all,
>> 
>> I’m in a bit of a cul de sac with an issue, hope you can help.
>> 
>> I am creating a custom UpdateRequestProcessor. The Solr documentation
>> details that I need to write a factory class subclassing
>> UpdateRequestProcessorFactory and this should return an instance of my
>> class that subclasses UpdateRequestProcessor.
>> 
>> I have done this, and I have created a JAR.
>> 
>> I have deployed the JAR into Tomcat’s lib folder where Solr is running.
>> 
>> I have modified the solrconfig to include my class correctly.
>> 
>> On startup Solr finds my class but does not believe it conforms to being a
>> UpdateRequestProcessorFactory.
>> 
>> SEVERE: org.apache.solr.common.SolrException: Error Instantiating
>> UpdateRequestProcessorFactory,
>> com.acme.solr.update.processor.URLRewriteUpdateRequestProcessorFactory is
>> not a org.apache.solr.update.processor.UpdateRequestProcessorFactory
>>     at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:421)
>> 
>> Things I have tried:
>> 
>> - Ensured that I am compiling my JAR with the exact JDK that is running
>> Solr.
>> - Downloaded Solr 3.6.2 and copied one of the Solr built-in processors,
>> renamed it, compiled it and tried to use it - SAME issue.
>> - Created a test that uses the same code as the Solr code that is failing
>> (namely, isAssignableFrom):
>> 
>>     Class<?> clazz = Class.forName(
>>         "com.acme.solr.update.processor.URLRewriteUpdateRequestProcessorFactory");
>>     boolean isA = UpdateRequestProcessorFactory.class.isAssignableFrom(clazz);
>>     System.out.println(isA);
>> 
>> Prints “true”, i.e. it’s perfectly OK!
>> 
>> I include my simple processor here:
>> 
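>> package com.acme.solr.update.processor;
>> 
>> import java.io.IOException;
>> 
>> import org.apache.solr.common.SolrInputDocument;
>> import org.apache.solr.request.SolrQueryRequest;
>> import org.apache.solr.response.SolrQueryResponse;
>> import org.apache.solr.update.AddUpdateCommand;
>> import org.apache.solr.update.processor.UpdateRequestProcessor;
>> import org.apache.solr.update.processor.UpdateRequestProcessorFactory;
>> 
>> public class URLRewriteUpdateRequestProcessorFactory extends
>>         UpdateRequestProcessorFactory
>> {
>>     @Override
>>     public UpdateRequestProcessor getInstance(SolrQueryRequest req,
>>             SolrQueryResponse rsp, UpdateRequestProcessor next) {
>>         return new URLRewriteProcessor(next);
>>     }
>> }
>> 
>> class URLRewriteProcessor extends UpdateRequestProcessor
>> {
>>     public URLRewriteProcessor(UpdateRequestProcessor next)
>>     {
>>         super(next);
>>     }
>> 
>>     @Override
>>     public void processAdd(AddUpdateCommand cmd) throws IOException
>>     {
>>         // Set a fixed field on every incoming document, then pass it on.
>>         SolrInputDocument doc = cmd.getSolrInputDocument();
>>         doc.setField("foo", "bar");
>> 
>>         super.processAdd(cmd);
>>     }
>> }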
>> 
>> At this point I am at a loss and would appreciate any assistance or ideas
>> to try.
>> 
>> Cheers
> 
> 
> 
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.



Will commit/softcommit invalid filtercache?

2014-09-20 Thread forest_soup
Hi, all.

We have some questions about commit/softcommit and caching.
We understand that a softcommit will create a new searcher. Will the
filtercache be invalidated after a softcommit is done?

And likewise for a commit: if we commit with openSearcher, will the
filtercache be invalidated?

Thanks!


