Re: Document removed from index after Delta import

2016-08-21 Thread Erick Erickson
You can't rely on any searches with that autocommit configuration.

openSearcher is set to false. Therefore you will not see any changes to
your index as a result of an expiring autoCommit interval.

I'm not sure whether DIH issues its own commit when done, but your
tests so far aren't particularly dependable. I'd try again after issuing
a manual hard commit or a soft commit.

You can do this with a URL:
/collection/update?commit=true
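
Or from SolrJ - a minimal sketch, assuming Solr 4.x's HttpSolrServer;
the URL and core name "collection1" below are placeholders, not taken
from this thread:

import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class ManualCommit {
    public static void main(String[] args) throws Exception {
        // Placeholder URL - point this at your own core.
        HttpSolrServer server =
            new HttpSolrServer("http://localhost:8983/solr/collection1");
        server.commit();  // hard commit; opens a new searcher by default
        // For a soft commit instead: commit(waitFlush, waitSearcher, softCommit)
        // server.commit(true, true, true);
        server.shutdown();
    }
}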

Here's the whole run-down on the various commit options.
https://lucidworks.com/blog/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

Best,
Erick

On Sun, Aug 21, 2016 at 7:43 AM, Or Gerson  wrote:
> Hello,
>
> I have Solr version 4.3.0.
>
> I have encountered a problem where a document is no longer returned from
> queries after a delta import, although the delta import does not report
> that any document has been deleted.
>
> I have a document composed of several fields. The delta import looks for a
> field called "update_date" and checks whether the date is after the last
> time the delta import was run.
>
> So, something like:
>
> select d.ID as UNIQUE_ID
>from document d
>where d.UPDATE_DATE >= '${dataimporter.last_index_time}' and
> d.OWNER_ID IS NOT NULL and d.DELETE_DATE IS NULL
>
>
> The deletedPkQuery looks like:
>
> select d.ID as UNIQUE_ID from document d
>   where d.DELETE_DATE >= '${dataimporter.last_index_time}'
>
> Doing a full import successfully fetches the documents.
>
> Then changing the update_date on a document removes it from the index,
> but no delete is reported in the log:
>
> "07 Aug 2016 18:29:24,278 [Thread-53] INFO  DocBuilder - Completed
> DeletedRowKey for Entity: permission_set rows obtained : 0"
>
> 07 Aug 2016 18:29:24,438 [Thread-53] INFO  MetadataOnImportEndEventListener
> - metadata import process end: {deletedDocCount=0, docCount=1,
> queryCount=6, rowCount=6, skipDocCount=0}
>
>
> The merge policy is:
>
> <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
>   <int name="maxMergeAtOnce">10</int>
>   <int name="segmentsPerTier">10</int>
> </mergePolicy>
>
>
> 
> 
>  1
> 
>  1
>
>
> Only hard commit is configured; soft commit is commented out:
>
> <autoCommit>
>   <maxDocs>1000</maxDocs>
>   <maxTime>3</maxTime>
>   <openSearcher>false</openSearcher>
> </autoCommit>
>
>
> I will greatly appreciate your help.
>
> Thanks,
> Or Gerson


Re: Solr 6.1.0, zookeeper 3.4.8, Solrj and SolrCloud

2016-08-21 Thread danny teichthal
Hi,
Not sure if it is related, but it could be - I see that you do this:

CloudSolrClient solrClient =
    new CloudSolrClient.Builder().withZkHost(zkHosts).build();

Are you creating a new client on each update?
If yes, note that the SolrClient should be a singleton.

Regarding the session timeout, what value did you set for zkClientTimeout?
The maxSessionTimeout parameter controls this timeout on the ZooKeeper side;
zkClientTimeout controls your client-side timeout. Both are in milliseconds.
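
For example, a rough sketch of one shared client with the timeouts set
explicitly - the zkHosts string and collection are from your mail, the
class name and timeout values are only placeholders:

import org.apache.solr.client.solrj.impl.CloudSolrClient;

public final class SolrClientHolder {
    // One CloudSolrClient for the whole application, created once.
    private static final CloudSolrClient CLIENT = build();

    private static CloudSolrClient build() {
        String zkHosts =
            "pcam-stg-app-02:2181,pcam-stg-app-03:2181,pcam-stg-app-04:2181/solr";
        CloudSolrClient client =
            new CloudSolrClient.Builder().withZkHost(zkHosts).build();
        client.setDefaultCollection("scdata_test");
        // Placeholder values, in milliseconds; zkClientTimeout should not
        // exceed ZooKeeper's maxSessionTimeout.
        client.setZkClientTimeout(30000);
        client.setZkConnectTimeout(15000);
        return client;
    }

    private SolrClientHolder() {}

    public static CloudSolrClient get() {
        return CLIENT;
    }
}

Reusing one client avoids setting up a new ZooKeeper session on every
update, which is itself a common source of session churn.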


Session expiry can also be affected by:
1. Garbage collection pauses on the Solr node or ZooKeeper.
2. Slow disk IO.
3. Network latency.

You should check these metrics on your system at the time you got the
expiry to see if any of them is related.
If your zkClientTimeout is set to a small value, then combined with one of
the factors above you could get many of these exceptions.

On Thu, Aug 18, 2016 at 6:51 PM, Narayana B  wrote:

> Hi SolrTeam,
>
> I see session expiry errors and my Solr indexing fails.
>
> Please help me here; my infra details are shared below.
>
> I have 3 compute nodes in total:
> [pcam-stg-app-02, pcam-stg-app-03, pcam-stg-app-04]
>
> 1) 3 nodes are running with zoo1, zoo2, zoo3 instances
>
> /apps/scm-core/zookeeper/zkData/zkData1/myid  value 1
> /apps/scm-core/zookeeper/zkData/zkData2/myid  value 2
> /apps/scm-core/zookeeper/zkData/zkData3/myid  value 3
>
> zoo1.cfg my setup
>
> tickTime=2000
> initLimit=5
> syncLimit=2
> dataDir=/apps/scm-core/zookeeper/zkData/zkData1
> clientPort=2181
> server.1=pcam-stg-app-01:2888:3888
> server.2=pcam-stg-app-02:2888:3888
> server.3=pcam-stg-app-03:2888:3888
> server.4=pcam-stg-app-04:2888:3888
> dataLogDir=/apps/scm-core/zookeeper/zkLogData/zkLogData1
> # Default 64M, changed to 128M, represented in KiloBytes
> preAllocSize=131072
> # Default : 10
> snapCount=100
> globalOutstandingLimit=1000
> maxClientCnxns=100
> autopurge.snapRetainCount=3
> autopurge.purgeInterval=23
> minSessionTimeout=4
> maxSessionTimeout=30
>
> [zk: pcam-stg-app-02:2181(CONNECTED) 0] ls /
> [zookeeper, solr]
> [zk: pcam-stg-app-02:2181(CONNECTED) 1] ls /solr
> [configs, overseer, aliases.json, live_nodes, collections, overseer_elect,
> security.json, clusterstate.json]
>
>
>
> 2) 2 nodes are running SolrCloud:
> pcam-stg-app-03: Solr ports 8983 and 8984
> pcam-stg-app-04: Solr ports 8983 and 8984
>
>
> Config upload to ZooKeeper:
>
> server/scripts/cloud-scripts/zkcli.sh \
>   -zkhost pcam-stg-app-02:2181,pcam-stg-app-03:2181,pcam-stg-app-04:2181/solr \
>   -cmd upconfig -confname scdata \
>   -confdir /apps/scm-core/solr/solr-6.1.0/server/solr/configsets/data_driven_schema_configs/conf
>
> Collection creation URL:
>
> http://pcam-stg-app-03:8983/solr/admin/collections?action=CREATE&name=scdata_test&numShards=2&replicationFactor=2&maxShardsPerNode=2&createNodeSet=pcam-stg-app-03:8983_solr,pcam-stg-app-03:8984_solr,pcam-stg-app-04:8983_solr,pcam-stg-app-04:8984_solr&collection.configName=scdata
>
> SolrJ client:
>
>
> String zkHosts =
> "pcam-stg-app-02:2181,pcam-stg-app-03:2181,pcam-stg-app-04:2181/solr";
> CloudSolrClient solrClient = new
> CloudSolrClient.Builder().withZkHost(zkHosts).build();
> solrClient.setDefaultCollection("scdata_test");
> solrClient.setParallelUpdates(true);
>
> List<CpnSpendSavings> cpnSpendSavingsList = new ArrayList<>();
> // I populate cpnSpendSavingsList with the data here
>
> solrClient.addBeans(cpnSpendSavingsList);
> solrClient.commit();
>
>
>
>
> SessionExpired error for the collections
>
> Why does this SessionExpired error come when I start bulk inserts/updates to Solr?
>
>
> org.apache.solr.common.SolrException: Could not load collection from ZK: scdata_test
>   at org.apache.solr.common.cloud.ZkStateReader.getCollectionLive(ZkStateReader.java:1047)
>   at org.apache.solr.common.cloud.ZkStateReader$LazyCollectionRef.get(ZkStateReader.java:610)
>   at org.apache.solr.common.cloud.ClusterState.getCollectionOrNull(ClusterState.java:211)
>   at org.apache.solr.common.cloud.ClusterState.hasCollection(ClusterState.java:113)
>   at org.apache.solr.client.solrj.impl.CloudSolrClient.getCollectionNames(CloudSolrClient.java:1239)
>   at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:961)
>   at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:934)
>   at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:149)
>   at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:106)
>   at org.apache.solr.client.solrj.SolrClient.addBeans(SolrClient.java:357)
>   at org.apache.solr.client.solrj.SolrClient.addBeans(SolrClient.java:329)
>   at com.cisco.pcam.spark.stream.HiveDataProcessStream.main(HiveDataProcessStream.java:165)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(