[jira] [Comment Edited] (GORA-386) Gora Spark Backend Support

2015-08-29 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14721270#comment-14721270
 ] 

Lewis John McGibbney edited comment on GORA-386 at 8/29/15 10:01 PM:
-

OK folks. I have reviewed the PR and I am quite happy to commit this unless 
there are objections. Nice work [~kamaci] and [~talat].


was (Author: lewismc):
OK folks. I have reviewed the PR and I am quite happy to commit this unless 
there are objections. Nice work [~kamaci].

> Gora Spark Backend Support
> --
>
> Key: GORA-386
> URL: https://issues.apache.org/jira/browse/GORA-386
> Project: Apache Gora
>  Issue Type: New Feature
>  Components: gora-core
>Reporter: Talat UYARER
>Assignee: Furkan KAMACI
>  Labels: gsoc2015
> Fix For: 0.6.1
>
> Attachments: connection_refused.txt
>
>
> Gora currently supports the MapReduce framework. This umbrella issue tracks 
> development of an Apache Spark backend. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (GORA-386) Gora Spark Backend Support

2015-08-29 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14721270#comment-14721270
 ] 

Lewis John McGibbney commented on GORA-386:
---

OK folks. I have reviewed the PR and I am quite happy to commit this unless 
there are objections. Nice work [~kamaci].

> Gora Spark Backend Support
> --
>
> Key: GORA-386
> URL: https://issues.apache.org/jira/browse/GORA-386
> Project: Apache Gora
>  Issue Type: New Feature
>  Components: gora-core
>Reporter: Talat UYARER
>Assignee: Furkan KAMACI
>  Labels: gsoc2015
> Fix For: 0.6.1
>
> Attachments: connection_refused.txt
>
>
> Gora currently supports the MapReduce framework. This umbrella issue tracks 
> development of an Apache Spark backend. 





[jira] [Resolved] (GORA-228) java.util.ConcurrentModificationException when using MemStore for concurrent tests

2015-08-29 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney resolved GORA-228.
---
Resolution: Fixed

> java.util.ConcurrentModificationException when using MemStore for concurrent 
> tests
> --
>
> Key: GORA-228
> URL: https://issues.apache.org/jira/browse/GORA-228
> Project: Apache Gora
>  Issue Type: Sub-task
>  Components: gora-core
>Affects Versions: 0.3
>    Reporter: Lewis John McGibbney
>    Assignee: Lewis John McGibbney
> Fix For: 0.6.1
>
> Attachments: GORA-228.patch, GORA-228v2.patch
>
>
> Finally, a multithreaded test in [3] fails with the following
> {code}
> java.util.ConcurrentModificationException
>   at java.util.TreeMap$NavigableSubMap$SubMapIterator.nextEntry(TreeMap.java:1594)
>   at java.util.TreeMap$NavigableSubMap$SubMapKeyIterator.next(TreeMap.java:1655)
>   at org.apache.gora.memory.store.MemStore$MemResult.nextInner(MemStore.java:81)
>   at org.apache.gora.query.impl.ResultBase.next(ResultBase.java:112)
>   at org.apache.nutch.storage.TestGoraStorage.readWrite(TestGoraStorage.java:74)
>   at org.apache.nutch.storage.TestGoraStorage.access$100(TestGoraStorage.java:41)
>   at org.apache.nutch.storage.TestGoraStorage$1.call(TestGoraStorage.java:107)
>   at org.apache.nutch.storage.TestGoraStorage$1.call(TestGoraStorage.java:102)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>   at java.lang.Thread.run(Thread.java:722)
> {code}
> I believe that the final failure is due to the use of TreeMap [5] as a 
> private object in MemStore. TreeMap implementations are not synchronized. If 
> multiple threads access a map concurrently, and at least one of the threads 
> modifies the map structurally, it must be synchronized externally. (A 
> structural modification is any operation that adds or deletes one or more 
> mappings; merely changing the value associated with an existing key is not a 
> structural modification.) This is typically accomplished by synchronizing on 
> some object that naturally encapsulates the map. If no such object exists, 
> the map should be "wrapped" using the Collections.synchronizedSortedMap 
> method. This is best done at creation time, to prevent accidental 
> unsynchronized access to the map, e.g.
>    SortedMap m = Collections.synchronizedSortedMap(new TreeMap(...));
> N.B. The note on TreeMap comes straight from the Oracle JavaDoc.
> [3] 
> http://svn.apache.org/viewvc/nutch/branches/2.x/src/test/org/apache/nutch/storage/TestGoraStorage.java?view=markup
> [4] 
> http://svn.apache.org/viewvc/nutch/branches/2.x/src/test/org/apache/nutch/util/AbstractNutchTest.java?view=markup
> [5] http://docs.oracle.com/javase/6/docs/api/java/util/TreeMap.html
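
The JavaDoc note quoted in the description can be illustrated with a minimal sketch. It is not the MemStore code: the class and method names are hypothetical, and a single-threaded modification during iteration stands in for the race between test threads so that the behaviour is deterministic. It shows three things: a plain TreeMap iterator fails fast when the map is structurally modified; iterating a Collections.synchronizedSortedMap wrapper is only safe while holding the wrapper's lock; and java.util.concurrent.ConcurrentSkipListMap, one possible alternative, has weakly consistent iterators that do not throw.

```java
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentSkipListMap;

final class SortedMapSafetySketch {

  // Fills a sorted map with three sample keys (illustrative data only).
  static <M extends SortedMap<String, Integer>> M fill(M map) {
    map.put("a", 1);
    map.put("b", 2);
    map.put("c", 3);
    return map;
  }

  // Adds a new key while iterating over the key set. Returns true if the
  // iterator reported the structural modification by throwing.
  static boolean failsWhenModifiedDuringIteration(SortedMap<String, Integer> map) {
    try {
      for (String key : map.keySet()) {
        map.put("!" + key, 0); // "!" sorts before letters, so the loop terminates
      }
      return false;
    } catch (ConcurrentModificationException e) {
      return true;
    }
  }

  // Iterates a synchronizedSortedMap wrapper while holding its lock, as the
  // Collections JavaDoc requires; other threads using the wrapper must wait.
  static int sumUnderLock(SortedMap<String, Integer> syncMap) {
    int sum = 0;
    synchronized (syncMap) {
      for (int value : syncMap.values()) {
        sum += value;
      }
    }
    return sum;
  }

  public static void main(String[] args) {
    System.out.println(
        failsWhenModifiedDuringIteration(fill(new TreeMap<String, Integer>())));
    System.out.println(
        failsWhenModifiedDuringIteration(fill(new ConcurrentSkipListMap<String, Integer>())));
    System.out.println(
        sumUnderLock(Collections.synchronizedSortedMap(fill(new TreeMap<String, Integer>()))));
  }
}
```

Note that wrapping alone makes individual calls such as put and get thread-safe, but any code that iterates (as the MemResult frame in the stack trace does) would still need the synchronized block or a concurrent map.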





[jira] [Created] (GORA-430) Address use of deprecated API's in 0.7

2015-08-29 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created GORA-430:
-

 Summary: Address use of deprecated API's in 0.7
 Key: GORA-430
 URL: https://issues.apache.org/jira/browse/GORA-430
 Project: Apache Gora
  Issue Type: Bug
  Components: build process
Affects Versions: 0.6.1
Reporter: Lewis John McGibbney
 Fix For: 0.7


We use deprecated APIs in a number of places. Maven highlights this in the 
build output.
We could address this in the 0.7 development drive.





Re: [discuss] roadmap for next release

2015-08-29 Thread Lewis John Mcgibbney
ACK Renato good idea.

On Sat, Aug 29, 2015 at 1:19 PM,  wrote:

>
>
> I think we should release as well, especially because gora-mongo has had
> some good improvements :)
> Maybe we should do like a minor release because the gora-cassandra stuff is
> kinda broken right now, and I don't think we should do an official release
> if some modules are not working properly.
>
>
master branch is running off of 0.6.1. I would be happy to package and ship
it and state that we have a major limitation in persisting nested data
structures into gora-cassandra.
We have some issues to contend with. I am working on them and will have
PRs very shortly.
https://issues.apache.org/jira/browse/GORA-386?jql=project%20%3D%20GORA%20AND%20resolution%20%3D%20Unresolved%20AND%20fixVersion%20%3D%200.6.1%20ORDER%20BY%20priority%20DESC


[jira] [Commented] (GORA-428) Null pointer exception caused by incorrect handling of gora.mongodb.login values that don't validate

2015-08-29 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14721246#comment-14721246
 ] 

Lewis John McGibbney commented on GORA-428:
---

Thanks [~kevinfindlay] I'm working to tie up some issues today and we can 
hopefully roll an RC at the end of the weekend.

> Null pointer exception caused by incorrect handling of gora.mongodb.login 
> values that don't validate
> 
>
> Key: GORA-428
> URL: https://issues.apache.org/jira/browse/GORA-428
> Project: Apache Gora
>  Issue Type: Bug
>  Components: gora-mongodb
>Affects Versions: 0.5
> Environment: Ubuntu, Nutch2
>Reporter: Kevin Findlay
>Priority: Minor
> Fix For: 0.7
>
>
> A null pointer exception occurs when the gora.mongodb.login=nutch value does 
> not validate. 
> Line 250 of the MongoStore class, in method "private DB getDB(final String 
> servers, final String dbname, final String login, final String secret) throws 
> UnknownHostException", returns null when the user is not validated. 
> if (login != null && secret != null) {
>   auth = db.authenticate(login, secret.toCharArray()); // PROBLEM
> }
> This causes the method to return null to:
>   mongoClientDB = getDB(vPropMongoServers, vPropMongoDb, vPropMongoLogin, 
>     vPropMongoSecret);
> At line 173 of MongoStore, in "public void initialize(final Class keyClass,
>   final Class pPersistentClass, final Properties properties)",
> the code attempts to use this null value (PROBLEM):
>   mongoClientColl = mongoClientDB
>     .getCollection(mapping.getCollectionName());
> The solution is to check mongoClientDB for null prior to use or, 
> better, write specific code to report when a login to MongoDB fails due to 
> authentication.
> ERROR MESSAGE
> ERROR MESSAGE
> InjectorJob: org.apache.gora.util.GoraException: java.lang.NullPointerException
>   at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:169)
>   at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:137)
>   at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:78)
>   at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:218)
>   at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:252)
>   at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:275)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>   at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:284)
> Caused by: java.lang.NullPointerException
>   at org.apache.gora.mongodb.store.MongoStore.initialize(MongoStore.java:173)
>   at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:104)
>   at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:163)
>   ... 7 more
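
The second remedy proposed in the description (report the authentication failure explicitly rather than handing a null back to initialize) might look roughly like the sketch below. This is only an illustration under stated assumptions: FakeDb is a hypothetical stand-in and not the MongoDB driver API, and getDb here is not the MongoStore method, it merely mirrors its shape.

```java
final class AuthCheckSketch {

  // Hypothetical stand-in for a database handle; NOT the MongoDB driver API.
  static final class FakeDb {
    private final String login;
    private final String secret;

    FakeDb(String login, String secret) {
      this.login = login;
      this.secret = secret;
    }

    boolean authenticate(String candidateLogin, char[] candidateSecret) {
      return login.equals(candidateLogin)
          && secret.equals(new String(candidateSecret));
    }
  }

  // Returns the handle when authentication succeeds or when no credentials are
  // supplied; otherwise fails fast with a descriptive message instead of
  // returning null to the caller.
  static FakeDb getDb(FakeDb db, String login, String secret) {
    if (login != null && secret != null) {
      boolean authenticated = db.authenticate(login, secret.toCharArray());
      if (!authenticated) {
        throw new IllegalStateException(
            "Authentication failed for login '" + login + "'");
      }
    }
    return db;
  }

  public static void main(String[] args) {
    FakeDb db = new FakeDb("nutch", "s3cret");
    System.out.println(getDb(db, "nutch", "s3cret") == db);
    try {
      getDb(db, "nutch", "wrong");
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Failing at the point of authentication keeps the cause visible in the stack trace, whereas a null check in the caller would only move the symptom.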





[jira] [Updated] (GORA-228) java.util.ConcurrentModificationException when using MemStore for concurrent tests

2015-08-29 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-228:
--
Fix Version/s: (was: 0.7)
   0.6.1

> java.util.ConcurrentModificationException when using MemStore for concurrent 
> tests
> --
>
> Key: GORA-228
> URL: https://issues.apache.org/jira/browse/GORA-228
> Project: Apache Gora
>  Issue Type: Sub-task
>  Components: gora-core
>Affects Versions: 0.3
>    Reporter: Lewis John McGibbney
>    Assignee: Lewis John McGibbney
> Fix For: 0.6.1
>
> Attachments: GORA-228.patch, GORA-228v2.patch
>
>
> Finally, a multithreaded test in [3] fails with the following
> {code}
> java.util.ConcurrentModificationException
>   at java.util.TreeMap$NavigableSubMap$SubMapIterator.nextEntry(TreeMap.java:1594)
>   at java.util.TreeMap$NavigableSubMap$SubMapKeyIterator.next(TreeMap.java:1655)
>   at org.apache.gora.memory.store.MemStore$MemResult.nextInner(MemStore.java:81)
>   at org.apache.gora.query.impl.ResultBase.next(ResultBase.java:112)
>   at org.apache.nutch.storage.TestGoraStorage.readWrite(TestGoraStorage.java:74)
>   at org.apache.nutch.storage.TestGoraStorage.access$100(TestGoraStorage.java:41)
>   at org.apache.nutch.storage.TestGoraStorage$1.call(TestGoraStorage.java:107)
>   at org.apache.nutch.storage.TestGoraStorage$1.call(TestGoraStorage.java:102)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>   at java.lang.Thread.run(Thread.java:722)
> {code}
> I believe that the final failure is due to the use of TreeMap [5] as a 
> private object in MemStore. TreeMap implementations are not synchronized. If 
> multiple threads access a map concurrently, and at least one of the threads 
> modifies the map structurally, it must be synchronized externally. (A 
> structural modification is any operation that adds or deletes one or more 
> mappings; merely changing the value associated with an existing key is not a 
> structural modification.) This is typically accomplished by synchronizing on 
> some object that naturally encapsulates the map. If no such object exists, 
> the map should be "wrapped" using the Collections.synchronizedSortedMap 
> method. This is best done at creation time, to prevent accidental 
> unsynchronized access to the map, e.g.
>    SortedMap m = Collections.synchronizedSortedMap(new TreeMap(...));
> N.B. The note on TreeMap comes straight from the Oracle JavaDoc.
> [3] 
> http://svn.apache.org/viewvc/nutch/branches/2.x/src/test/org/apache/nutch/storage/TestGoraStorage.java?view=markup
> [4] 
> http://svn.apache.org/viewvc/nutch/branches/2.x/src/test/org/apache/nutch/util/AbstractNutchTest.java?view=markup
> [5] http://docs.oracle.com/javase/6/docs/api/java/util/TreeMap.html





[jira] [Commented] (GORA-428) Null pointer exception caused by incorrect handling of gora.mongodb.login values that don't validate

2015-08-29 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14721148#comment-14721148
 ] 

Lewis John McGibbney commented on GORA-428:
---

Would be a huge help sir.




-- 
*Lewis*


> Null pointer exception caused by incorrect handling of gora.mongodb.login 
> values that don't validate
> 
>
> Key: GORA-428
> URL: https://issues.apache.org/jira/browse/GORA-428
> Project: Apache Gora
>  Issue Type: Bug
>  Components: gora-mongodb
>Affects Versions: 0.5
> Environment: Ubuntu, Nutch2
>Reporter: Kevin Findlay
>Priority: Minor
> Fix For: 0.7
>
>
> A null pointer exception occurs when the gora.mongodb.login=nutch value does 
> not validate. 
> Line 250 of the MongoStore class, in method "private DB getDB(final String 
> servers, final String dbname, final String login, final String secret) throws 
> UnknownHostException", returns null when the user is not validated. 
> if (login != null && secret != null) {
>   auth = db.authenticate(login, secret.toCharArray()); // PROBLEM
> }
> This causes the method to return null to:
>   mongoClientDB = getDB(vPropMongoServers, vPropMongoDb, vPropMongoLogin, 
>     vPropMongoSecret);
> At line 173 of MongoStore, in "public void initialize(final Class keyClass,
>   final Class pPersistentClass, final Properties properties)",
> the code attempts to use this null value (PROBLEM):
>   mongoClientColl = mongoClientDB
>     .getCollection(mapping.getCollectionName());
> The solution is to check mongoClientDB for null prior to use or, 
> better, write specific code to report when a login to MongoDB fails due to 
> authentication.
> ERROR MESSAGE
> InjectorJob: org.apache.gora.util.GoraException: java.lang.NullPointerException
>   at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:169)
>   at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:137)
>   at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:78)
>   at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:218)
>   at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:252)
>   at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:275)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>   at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:284)
> Caused by: java.lang.NullPointerException
>   at org.apache.gora.mongodb.store.MongoStore.initialize(MongoStore.java:173)
>   at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:104)
>   at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:163)
>   ... 7 more





[jira] [Updated] (GORA-416) Error when populating data into Cassandra super column - InvalidRequestException(why:supercolumn parameter is not optional for super CF sc

2015-08-28 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-416:
--
Fix Version/s: (was: 0.6.1)
   0.7

> Error when populating data into Cassandra super column - 
> InvalidRequestException(why:supercolumn parameter is not optional for super 
> CF sc
> --
>
> Key: GORA-416
> URL: https://issues.apache.org/jira/browse/GORA-416
> Project: Apache Gora
>  Issue Type: Bug
>  Components: gora-cassandra
>Affects Versions: 0.6
> Environment: Nutch 2.4-SNAPSHOT, Gora 0.6.1-SNAPSHOT, Hadoop 2.5.2, 
> Cassandra 2.0.7
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Blocker
> Fix For: 0.7
>
> Attachments: GORA-416.patch
>
>
> Error when populating data into Cassandra super column.
> {code}
> lmcgibbn@LMC-032857 /usr/local/2webgui/runtime/local(master) $ ./bin/nutch fetch 1425410774-370456822
> FetcherJob: starting at 2015-03-03 11:27:57
> FetcherJob: batchId: 1425410774-370456822
> FetcherJob: threads: 10
> FetcherJob: parsing: false
> FetcherJob: resuming: false
> FetcherJob : timelimit set for : -1
> 2015-03-03 11:27:58.101 java[3267:1903] Unable to load realm info from SCDynamicStore
> Using queue mode : byHost
> Fetcher: threads: 10
> QueueFeeder finished: total 1 records. Hit by time limit :0
> fetching http://nutch.apache.org/ (queue crawl delay=5000ms)
> -finishing thread FetcherThread1, activeThreads=1
> -finishing thread FetcherThread2, activeThreads=1
> -finishing thread FetcherThread3, activeThreads=1
> -finishing thread FetcherThread4, activeThreads=1
> -finishing thread FetcherThread5, activeThreads=1
> -finishing thread FetcherThread6, activeThreads=1
> -finishing thread FetcherThread7, activeThreads=1
> -finishing thread FetcherThread8, activeThreads=1
> Fetcher: throughput threshold: -1
> -finishing thread FetcherThread9, activeThreads=1
> Fetcher: throughput threshold sequence: 5
> -finishing thread FetcherThread0, activeThreads=0
> 0/0 spinwaiting/active, 1 pages, 0 errors, 0.2 0 pages/s, 82 82 kb/s, 0 URLs in 0 queues
> -activeThreads=0
> me.prettyprint.hector.api.exceptions.HInvalidRequestException: InvalidRequestException(why:supercolumn parameter is not optional for super CF sc)
>   at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:52)
>   at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:260)
>   at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:113)
>   at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:243)
>   at me.prettyprint.cassandra.model.MutatorImpl.insert(MutatorImpl.java:69)
>   at org.apache.gora.cassandra.store.HectorUtils.insertColumn(HectorUtils.java:46)
>   at org.apache.gora.cassandra.store.CassandraClient.addColumn(CassandraClient.java:293)
>   at org.apache.gora.cassandra.store.CassandraStore.addOrUpdateField(CassandraStore.java:512)
>   at org.apache.gora.cassandra.store.CassandraStore.addOrUpdateField(CassandraStore.java:598)
>   at org.apache.gora.cassandra.store.CassandraStore.flush(CassandraStore.java:316)
>   at org.apache.gora.cassandra.store.CassandraStore.close(CassandraStore.java:160)
>   at org.apache.gora.mapreduce.GoraRecordWriter.close(GoraRecordWriter.java:56)
>   at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.close(ReduceTask.java:550)
>   at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:629)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: InvalidRequestException(why:supercolumn parameter is not optional for super CF sc)
>   at 
> org.apache.cassandra.thrift.Cassandra$batch_mutate_resu

[jira] [Updated] (GORA-428) Null pointer exception caused by incorrect handling of gora.mongodb.login values that don't validate

2015-08-28 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-428:
--
Fix Version/s: 0.7

> Null pointer exception caused by incorrect handling of gora.mongodb.login 
> values that don't validate
> 
>
> Key: GORA-428
> URL: https://issues.apache.org/jira/browse/GORA-428
> Project: Apache Gora
>  Issue Type: Bug
>  Components: gora-mongodb
>Affects Versions: 0.5
> Environment: Ubuntu, Nutch2
>Reporter: Kevin Findlay
>Priority: Minor
> Fix For: 0.7
>
>
> A null pointer exception occurs when the gora.mongodb.login=nutch value does 
> not validate. 
> Line 250 of the MongoStore class, in method "private DB getDB(final String 
> servers, final String dbname, final String login, final String secret) throws 
> UnknownHostException", returns null when the user is not validated. 
> if (login != null && secret != null) {
>   auth = db.authenticate(login, secret.toCharArray()); // PROBLEM
> }
> This causes the method to return null to:
>   mongoClientDB = getDB(vPropMongoServers, vPropMongoDb, vPropMongoLogin, 
>     vPropMongoSecret);
> At line 173 of MongoStore, in "public void initialize(final Class keyClass,
>   final Class pPersistentClass, final Properties properties)",
> the code attempts to use this null value (PROBLEM):
>   mongoClientColl = mongoClientDB
>     .getCollection(mapping.getCollectionName());
> The solution is to check mongoClientDB for null prior to use or, 
> better, write specific code to report when a login to MongoDB fails due to 
> authentication.
> ERROR MESSAGE
> InjectorJob: org.apache.gora.util.GoraException: java.lang.NullPointerException
>   at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:169)
>   at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:137)
>   at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:78)
>   at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:218)
>   at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:252)
>   at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:275)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>   at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:284)
> Caused by: java.lang.NullPointerException
>   at org.apache.gora.mongodb.store.MongoStore.initialize(MongoStore.java:173)
>   at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:104)
>   at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:163)
>   ... 7 more





[jira] [Reopened] (GORA-228) java.util.ConcurrentModificationException when using MemStore for concurrent tests

2015-08-28 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney reopened GORA-228:
---
  Assignee: Lewis John McGibbney  (was: Yasin Kılınç)

This is not fixed. 
I've reverted the commit, as I stupidly tested it against Nutch and not against 
Gora tests. Doh. Sorry folks.

> java.util.ConcurrentModificationException when using MemStore for concurrent 
> tests
> --
>
> Key: GORA-228
> URL: https://issues.apache.org/jira/browse/GORA-228
> Project: Apache Gora
>  Issue Type: Sub-task
>  Components: gora-core
>Affects Versions: 0.3
>Reporter: Lewis John McGibbney
>    Assignee: Lewis John McGibbney
> Fix For: 0.7
>
> Attachments: GORA-228.patch, GORA-228v2.patch
>
>
> Finally, a multithreaded test in [3] fails with the following
> {code}
> java.util.ConcurrentModificationException
>   at java.util.TreeMap$NavigableSubMap$SubMapIterator.nextEntry(TreeMap.java:1594)
>   at java.util.TreeMap$NavigableSubMap$SubMapKeyIterator.next(TreeMap.java:1655)
>   at org.apache.gora.memory.store.MemStore$MemResult.nextInner(MemStore.java:81)
>   at org.apache.gora.query.impl.ResultBase.next(ResultBase.java:112)
>   at org.apache.nutch.storage.TestGoraStorage.readWrite(TestGoraStorage.java:74)
>   at org.apache.nutch.storage.TestGoraStorage.access$100(TestGoraStorage.java:41)
>   at org.apache.nutch.storage.TestGoraStorage$1.call(TestGoraStorage.java:107)
>   at org.apache.nutch.storage.TestGoraStorage$1.call(TestGoraStorage.java:102)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>   at java.lang.Thread.run(Thread.java:722)
> {code}
> I believe that the final failure is due to the use of TreeMap [5] as a 
> private object in MemStore. TreeMap implementations are not synchronized. If 
> multiple threads access a map concurrently, and at least one of the threads 
> modifies the map structurally, it must be synchronized externally. (A 
> structural modification is any operation that adds or deletes one or more 
> mappings; merely changing the value associated with an existing key is not a 
> structural modification.) This is typically accomplished by synchronizing on 
> some object that naturally encapsulates the map. If no such object exists, 
> the map should be "wrapped" using the Collections.synchronizedSortedMap 
> method. This is best done at creation time, to prevent accidental 
> unsynchronized access to the map, e.g.
>    SortedMap m = Collections.synchronizedSortedMap(new TreeMap(...));
> N.B. The note on TreeMap comes straight from the Oracle JavaDoc.
> [3] 
> http://svn.apache.org/viewvc/nutch/branches/2.x/src/test/org/apache/nutch/storage/TestGoraStorage.java?view=markup
> [4] 
> http://svn.apache.org/viewvc/nutch/branches/2.x/src/test/org/apache/nutch/util/AbstractNutchTest.java?view=markup
> [5] http://docs.oracle.com/javase/6/docs/api/java/util/TreeMap.html





[jira] [Created] (GORA-429) Implement Maven forbidden-apis plugin in Gora

2015-08-28 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created GORA-429:
-

 Summary: Implement Maven forbidden-apis plugin in Gora
 Key: GORA-429
 URL: https://issues.apache.org/jira/browse/GORA-429
 Project: Apache Gora
  Issue Type: Improvement
  Components: build process
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
 Fix For: 0.6.1


The [forbidden-apis Maven 
plugin|https://github.com/policeman-tools/forbidden-apis] allows us to parse 
Java byte code to find invocations of method/class/field signatures and fail 
the build 





[jira] [Updated] (GORA-386) Gora Spark Backend Support

2015-08-28 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-386:
--
Fix Version/s: (was: 0.7)
   0.6.1

> Gora Spark Backend Support
> --
>
> Key: GORA-386
> URL: https://issues.apache.org/jira/browse/GORA-386
> Project: Apache Gora
>  Issue Type: New Feature
>  Components: gora-core
>Reporter: Talat UYARER
>Assignee: Furkan KAMACI
>  Labels: gsoc2015
> Fix For: 0.6.1
>
> Attachments: connection_refused.txt
>
>
> Gora currently supports the MapReduce framework. This umbrella issue tracks 
> development of an Apache Spark backend. 





[jira] [Commented] (GORA-428) Null pointer exception caused by incorrect handling of gora.mongodb.login values that don't validate

2015-08-28 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720505#comment-14720505
 ] 

Lewis John McGibbney commented on GORA-428:
---

[~kevinfindlay] thank you for reporting, I only just noticed this and I am not 
actively using MongoStore. Are you able to provide a patch?

> Null pointer exception caused by incorrect handling of gora.mongodb.login 
> values that don't validate
> 
>
> Key: GORA-428
> URL: https://issues.apache.org/jira/browse/GORA-428
> Project: Apache Gora
>  Issue Type: Bug
>  Components: gora-mongodb
>Affects Versions: 0.5
> Environment: Ubuntu, Nutch2
>Reporter: Kevin Findlay
>Priority: Minor
>
> A null pointer exception occurs when the gora.mongodb.login=nutch value does 
> not validate. 
> Line 250 of the MongoStore class, in method "private DB getDB(final String 
> servers, final String dbname, final String login, final String secret) throws 
> UnknownHostException", returns null when the user is not validated. 
> if (login != null && secret != null) {
>   auth = db.authenticate(login, secret.toCharArray()); // PROBLEM
> }
> This causes the method to return null to:
>   mongoClientDB = getDB(vPropMongoServers, vPropMongoDb, vPropMongoLogin, 
>     vPropMongoSecret);
> At line 173 of MongoStore, in "public void initialize(final Class keyClass,
>   final Class pPersistentClass, final Properties properties)",
> the code attempts to use this null value (PROBLEM):
>   mongoClientColl = mongoClientDB
>     .getCollection(mapping.getCollectionName());
> The solution is to check mongoClientDB for null prior to use or, 
> better, write specific code to report when a login to MongoDB fails due to 
> authentication.
> ERROR MESSAGE
> InjectorJob: org.apache.gora.util.GoraException: java.lang.NullPointerException
>   at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:169)
>   at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:137)
>   at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:78)
>   at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:218)
>   at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:252)
>   at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:275)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>   at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:284)
> Caused by: java.lang.NullPointerException
>   at org.apache.gora.mongodb.store.MongoStore.initialize(MongoStore.java:173)
>   at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:104)
>   at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:163)
>   ... 7 more





[jira] [Updated] (GORA-418) Multi Execution Engine for Gora

2015-08-28 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-418:
--
Fix Version/s: 0.7

> Multi Execution Engine for Gora
> ---
>
> Key: GORA-418
> URL: https://issues.apache.org/jira/browse/GORA-418
> Project: Apache Gora
>  Issue Type: New Feature
>Reporter: Talat UYARER
>  Labels: gsoc2015
> Fix For: 0.7
>
>
> At present we support the MapReduce engine. We could support multiple 
> execution engines for Gora.





[jira] [Updated] (GORA-425) Allow different tables per DataStore object

2015-08-28 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-425:
--
Fix Version/s: 0.7

> Allow different tables per DataStore object
> ---
>
> Key: GORA-425
> URL: https://issues.apache.org/jira/browse/GORA-425
> Project: Apache Gora
>  Issue Type: Improvement
>  Components: gora-hbase
>Affects Versions: 0.6
>Reporter: Renato Javier Marroquín Mogrovejo
> Fix For: 0.7
>
>
> Issue described by Harry Papaxenopoulos in the mailing list.
> http://www.mail-archive.com/dev%40gora.apache.org/msg05949.html
> We should extend this feature to the other data stores as well.





[jira] [Updated] (GORA-421) PersistentBase#setDirty() does not set dirty

2015-08-28 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-421:
--
Fix Version/s: 0.7

> PersistentBase#setDirty() does not set dirty
> 
>
> Key: GORA-421
> URL: https://issues.apache.org/jira/browse/GORA-421
> Project: Apache Gora
>  Issue Type: Bug
>  Components: gora-core
>Affects Versions: 0.4, 0.5, 0.6
>Reporter: Alfonso Nishikawa
> Fix For: 0.7
>
>
> {{PersistentBase#setDirty()}} sets [dirty with value -128 = 
> 10000000|https://github.com/apache/gora/blob/master/gora-core/src/main/java/org/apache/gora/persistency/impl/PersistentBase.java#L164],
>  when it should be -1 = 11111111.
> Some tests would be desirable too.
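
The difference between the two byte values can be checked with a small stand-alone model. This sketch assumes the dirty state is a byte bitmap with one bit per field, as the linked line suggests; `DirtyBits`, `fill` and `countSetBits` are hypothetical names and not part of Gora.

```java
import java.nio.ByteBuffer;

final class DirtyBits {
    /** Fills every byte of the bitmap with the given value. */
    static void fill(ByteBuffer dirtyBytes, byte value) {
        for (int i = 0; i < dirtyBytes.limit(); i++) {
            dirtyBytes.put(i, value);
        }
    }

    /** Counts how many bits are set across the whole bitmap. */
    static int countSetBits(ByteBuffer dirtyBytes) {
        int count = 0;
        for (int i = 0; i < dirtyBytes.limit(); i++) {
            // Mask to 0..255 so the sign extension of negative bytes is not counted.
            count += Integer.bitCount(dirtyBytes.get(i) & 0xFF);
        }
        return count;
    }

    public static void main(String[] args) {
        ByteBuffer bitmap = ByteBuffer.allocate(2); // room for 16 field flags
        fill(bitmap, (byte) -128);                  // 10000000 in each byte
        System.out.println(countSetBits(bitmap));   // one flag per byte
        fill(bitmap, (byte) -1);                    // 11111111 in each byte
        System.out.println(countSetBits(bitmap));   // every flag
    }
}
```

A unit test along these lines, asserting that every field reports dirty after the call, would cover the regression the reporter asks for.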





[jira] [Commented] (GORA-416) Error when populating data into Cassandra super column - InvalidRequestException(why:supercolumn parameter is not optional for super CF sc

2015-08-28 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720499#comment-14720499
 ] 

Lewis John McGibbney commented on GORA-416:
---

[~renato2099] do you want to close this off as "Won't Fix" and then just work on 
the DataStax Java driver code?

> Error when populating data into Cassandra super column - 
> InvalidRequestException(why:supercolumn parameter is not optional for super 
> CF sc
> --
>
> Key: GORA-416
> URL: https://issues.apache.org/jira/browse/GORA-416
> Project: Apache Gora
>  Issue Type: Bug
>  Components: gora-cassandra
>Affects Versions: 0.6
> Environment: Nutch 2.4-SNAPSHOT, Gora 0.6.1-SNAPSHOT, Hadoop 2.5.2, 
> Cassandra 2.0.7
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Blocker
> Fix For: 0.6.1
>
> Attachments: GORA-416.patch
>
>
> Error when populating data into Cassandra super column.
> {code}
> lmcgibbn@LMC-032857 /usr/local/2webgui/runtime/local(master) $ ./bin/nutch 
> fetch 1425410774-370456822
> FetcherJob: starting at 2015-03-03 11:27:57
> FetcherJob: batchId: 1425410774-370456822
> FetcherJob: threads: 10
> FetcherJob: parsing: false
> FetcherJob: resuming: false
> FetcherJob : timelimit set for : -1
> 2015-03-03 11:27:58.101 java[3267:1903] Unable to load realm info from 
> SCDynamicStore
> Using queue mode : byHost
> Fetcher: threads: 10
> QueueFeeder finished: total 1 records. Hit by time limit :0
> fetching http://nutch.apache.org/ (queue crawl delay=5000ms)
> -finishing thread FetcherThread1, activeThreads=1
> -finishing thread FetcherThread2, activeThreads=1
> -finishing thread FetcherThread3, activeThreads=1
> -finishing thread FetcherThread4, activeThreads=1
> -finishing thread FetcherThread5, activeThreads=1
> -finishing thread FetcherThread6, activeThreads=1
> -finishing thread FetcherThread7, activeThreads=1
> -finishing thread FetcherThread8, activeThreads=1
> Fetcher: throughput threshold: -1
> -finishing thread FetcherThread9, activeThreads=1
> Fetcher: throughput threshold sequence: 5
> -finishing thread FetcherThread0, activeThreads=0
> 0/0 spinwaiting/active, 1 pages, 0 errors, 0.2 0 pages/s, 82 82 kb/s, 0 URLs 
> in 0 queues
> -activeThreads=0
> me.prettyprint.hector.api.exceptions.HInvalidRequestException: 
> InvalidRequestException(why:supercolumn parameter is not optional for super 
> CF sc)
>   at 
> me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:52)
>   at 
> me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:260)
>   at 
> me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:113)
>   at 
> me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:243)
>   at 
> me.prettyprint.cassandra.model.MutatorImpl.insert(MutatorImpl.java:69)
>   at 
> org.apache.gora.cassandra.store.HectorUtils.insertColumn(HectorUtils.java:46)
>   at 
> org.apache.gora.cassandra.store.CassandraClient.addColumn(CassandraClient.java:293)
>   at 
> org.apache.gora.cassandra.store.CassandraStore.addOrUpdateField(CassandraStore.java:512)
>   at 
> org.apache.gora.cassandra.store.CassandraStore.addOrUpdateField(CassandraStore.java:598)
>   at 
> org.apache.gora.cassandra.store.CassandraStore.flush(CassandraStore.java:316)
>   at 
> org.apache.gora.cassandra.store.CassandraStore.close(CassandraStore.java:160)
>   at 
> org.apache.gora.mapreduce.GoraRecordWriter.close(GoraRecordWriter.java:56)
>   at 
> org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.close(ReduceTask.java:550)
>   at 
> org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:629)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: InvalidRequestException(why:supercolumn 

[jira] [Commented] (GORA-417) Deploy Hadoop-1 compatible binaries for Apache Gora

2015-08-28 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720497#comment-14720497
 ] 

Lewis John McGibbney commented on GORA-417:
---

Hi [~hsaputra] is this addressed? I can potentially work on it if you would 
like.

> Deploy Hadoop-1 compatible binaries for Apache Gora
> ---
>
> Key: GORA-417
> URL: https://issues.apache.org/jira/browse/GORA-417
> Project: Apache Gora
>  Issue Type: Task
>Affects Versions: 0.6
>Reporter: Henry Saputra
> Fix For: 0.6.1
>
>
> Since we are now supporting both Hadoop 1.x and 2.x, we need to deploy 
> binaries to Maven Central to support both versions.





[jira] [Commented] (GORA-228) java.util.ConcurrentModificationException when using MemStore for concurrent tests

2015-08-28 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720494#comment-14720494
 ] 

Lewis John McGibbney commented on GORA-228:
---

Thank you [~icebergx5] and [~cguzel]

> java.util.ConcurrentModificationException when using MemStore for concurrent 
> tests
> --
>
> Key: GORA-228
> URL: https://issues.apache.org/jira/browse/GORA-228
> Project: Apache Gora
>  Issue Type: Sub-task
>  Components: gora-core
>Affects Versions: 0.3
>    Reporter: Lewis John McGibbney
>Assignee: Yasin Kılınç
> Fix For: 0.7
>
> Attachments: GORA-228.patch, GORA-228v2.patch
>
>
> Finally, a multithreaded test in [3] fails with the following
> {code}
> java.util.ConcurrentModificationException
>   at 
> java.util.TreeMap$NavigableSubMap$SubMapIterator.nextEntry(TreeMap.java:1594)
>   at 
> java.util.TreeMap$NavigableSubMap$SubMapKeyIterator.next(TreeMap.java:1655)
>   at 
> org.apache.gora.memory.store.MemStore$MemResult.nextInner(MemStore.java:81)
>   at org.apache.gora.query.impl.ResultBase.next(ResultBase.java:112)
>   at 
> org.apache.nutch.storage.TestGoraStorage.readWrite(TestGoraStorage.java:74)
>   at 
> org.apache.nutch.storage.TestGoraStorage.access$100(TestGoraStorage.java:41)
>   at 
> org.apache.nutch.storage.TestGoraStorage$1.call(TestGoraStorage.java:107)
>   at 
> org.apache.nutch.storage.TestGoraStorage$1.call(TestGoraStorage.java:102)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>   at java.lang.Thread.run(Thread.java:722)
> {code}
> I believe that the final failure is due to the use of TreeMap [5] as a 
> private object in MemStore. TreeMap implementations are not synchronized. If 
> multiple threads access a map concurrently, and at least one of the threads 
> modifies the map structurally, it must be synchronized externally. (A 
> structural modification is any operation that adds or deletes one or more 
> mappings; merely changing the value associated with an existing key is not a 
> structural modification.) This is typically accomplished by synchronizing on 
> some object that naturally encapsulates the map. If no such object exists, 
> the map should be "wrapped" using the Collections.synchronizedSortedMap 
> method. This is best done at creation time, to prevent accidental 
> unsynchronized access to the map e.g.
>SortedMap m = Collections.synchronizedSortedMap(new TreeMap(...));
> N.B. The note on TreeMap comes straight from the Oracle Javadoc.
> [3] 
> http://svn.apache.org/viewvc/nutch/branches/2.x/src/test/org/apache/nutch/storage/TestGoraStorage.java?view=markup
> [4] 
> http://svn.apache.org/viewvc/nutch/branches/2.x/src/test/org/apache/nutch/util/AbstractNutchTest.java?view=markup
> [5] http://docs.oracle.com/javase/6/docs/api/java/util/TreeMap.html
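
The wrapping suggested in the issue can be sketched as follows. One detail matters for the failing trace above, which iterates over a sub-map view: per the `Collections.synchronizedSortedMap` Javadoc, iteration over the map or any of its views must still be synchronized manually on the wrapper. `SynchronizedMapSketch` and its methods are hypothetical names, not the MemStore code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

final class SynchronizedMapSketch {
    // Wrapped at creation time so no unsynchronized reference escapes.
    private final SortedMap<String, String> map =
        Collections.synchronizedSortedMap(new TreeMap<String, String>());

    void put(String key, String value) {
        map.put(key, value); // single calls are synchronized by the wrapper
    }

    /** Copies the keys in [from, to) while holding the map's lock. */
    List<String> keysBetween(String from, String to) {
        synchronized (map) { // traversing any view must lock the wrapper itself
            return new ArrayList<>(map.subMap(from, to).keySet());
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SynchronizedMapSketch store = new SynchronizedMapSketch();
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) {
                store.put(String.format("k%05d", i), "v");
            }
        });
        writer.start();
        for (int i = 0; i < 100; i++) {
            store.keysBetween("k00000", "k99999"); // safe while the writer runs
        }
        writer.join();
        System.out.println(store.keysBetween("k00000", "k99999").size());
    }
}
```

Returning a copy keeps the lock short-lived; a result object that iterates lazily would need a different approach, for example a concurrent map such as `ConcurrentSkipListMap`.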





[discuss] roadmap for next release

2015-08-27 Thread Lewis John Mcgibbney
Hi Folks,
We've not released Gora in a while and a number of bugs have been fixed.
Is there any interest in shipping a release? If so, what are the clean-up
items?
I am happy to take on the release manager (RM) role.
Thanks
Lewis


-- 
*Lewis*


Re: Connection Problem at Testing of GoraSparkEngine at Hbase

2015-08-26 Thread Lewis John Mcgibbney
Hi Furkan,

On Wed, Aug 26, 2015 at 8:54 PM,  wrote:

>
> I've finished my GSoC project but I have a problem. I've implemented a
> Spark backend for Gora and I've written a word count test class for it.
>

Yep. Nice work.


>
> Here is my particular test method:
>
>
> https://github.com/kamaci/gora/blob/master/gora-hbase/src/test/java/org/apache/gora/hbase/mapreduce/TestHBaseStoreWordCount.java#L65
>

Actually, the test and method are here:
https://github.com/kamaci/gora/blob/master/gora-core/src/examples/java/org/apache/gora/examples/spark/SparkWordCount.java



>
>
> When I run my test there is no need to startup an Hbase cluster because
> Spark will connect to my dummy cluster. However when I run my test method
> it throws an error. Here is a part from stack trace:
>
>
OK, maybe I am interpreting things incorrectly here. However, unless you want
SparkWordCount.java to be run for every datastore, it should be tested
ONLY by GoraSparkEngine... no?

I need to also comment that I think it is important for you to implement
our base test suite defined within

https://github.com/apache/gora/blob/master/gora-core/src/test/java/org/apache/gora/store/DataStoreTestUtil.java

and implemented via

https://github.com/apache/gora/blob/master/gora-core/src/test/java/org/apache/gora/store/DataStoreTestBase.java

Does this make sense? We need to have the GoraSparkEngine displaying that
it is able to store, operate upon and provide access to all of the test
cases provided under the base test suite.
This is also detailed within our documentation
http://gora.apache.org/current/index.html#gora-testing

If I were you I would implement the Unit tests under
https://github.com/kamaci/gora/tree/master/gora-core/src/test/java/org/apache/gora/spark
or something like that.

[snip]


>
> Any ideas about solving that connection problem?
>

I wouldn't worry about it right now. I would address the questions above
first and tell me/us what you think about that. We can then move on to the
HBase cluster at a later stage.


>
> PS 1: I've ignored the test at my Github repository.
> PS 2: I don't think that there is a problem Spark side.
> PS 3: I'll upload full stack trace to
> https://issues.apache.org/jira/browse/GORA-386
>

Great thank you.
Lewis


[jira] [Commented] (GORA-386) Gora Spark Backend Support

2015-08-26 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14716054#comment-14716054
 ] 

Lewis John McGibbney commented on GORA-386:
---

Can you also add a page to the current documentation detailing your additions 
e.g. GoraSparkEngine?
This would go in 
http://svn.apache.org/repos/asf/gora/site/trunk/content/current/

> Gora Spark Backend Support
> --
>
> Key: GORA-386
> URL: https://issues.apache.org/jira/browse/GORA-386
> Project: Apache Gora
>  Issue Type: New Feature
>  Components: gora-core
>Reporter: Talat UYARER
>Assignee: Furkan KAMACI
>  Labels: gsoc2015
> Fix For: 0.7
>
> Attachments: connection_refused.txt
>
>
> Now Gora supports Map Reduce Framework. With this umbrella issue we try to 
> develop Apache Spark Backend. 





[jira] [Commented] (GORA-386) Gora Spark Backend Support

2015-08-26 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14716053#comment-14716053
 ] 

Lewis John McGibbney commented on GORA-386:
---

[~kamaci] please address the above comments when you have the time. Thank you.

> Gora Spark Backend Support
> --
>
> Key: GORA-386
> URL: https://issues.apache.org/jira/browse/GORA-386
> Project: Apache Gora
>  Issue Type: New Feature
>  Components: gora-core
>Reporter: Talat UYARER
>Assignee: Furkan KAMACI
>  Labels: gsoc2015
> Fix For: 0.7
>
> Attachments: connection_refused.txt
>
>
> Now Gora supports Map Reduce Framework. With this umbrella issue we try to 
> develop Apache Spark Backend. 





[jira] [Resolved] (GORA-419) AccumuloStore.put deletes entire row when updating map/array field

2015-08-26 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney resolved GORA-419.
---
Resolution: Fixed

Counting objects: 13, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (8/8), done.
Writing objects: 100% (13/13), 1.13 KiB | 0 bytes/s, done.
Total 13 (delta 4), reused 0 (delta 0)
To https://git-wip-us.apache.org/repos/asf/gora.git
   1f6ba32..ed768b4  master -> master

Nice work [~gerhard.gossen] thank you

> AccumuloStore.put deletes entire row when updating map/array field
> --
>
> Key: GORA-419
> URL: https://issues.apache.org/jira/browse/GORA-419
> Project: Apache Gora
>  Issue Type: Bug
>  Components: gora-accumulo
>Affects Versions: 0.5, 0.6
> Environment: Gora 0.5
> Accumulo 1.5.1
> Zookeeper 3.4.6
> Hadoop 1.2.1
>Reporter: Gerhard Gossen
>Assignee: Gerhard Gossen
>Priority: Critical
>
> In {{AccumuloStore.put(k, v)}} fields of type MAP or ARRAY are cleared first 
> before they are set to the new value. This is done in the methods 
> {{putMap}}/{{putArray}} using a call to {{deleteByQuery(q)}}. The name for 
> fields to be deleted is taken from the current column. However, 
> {{deleteByQuery}} tries to translate the field names of the query to column 
> names again, which fails with a log message like
> {code}
> 2015-04-13 13:43:35.084 ERROR 16733 --- [ool-46-thread-1] 
> o.a.gora.accumulo.store.AccumuloStore: Mapping not found for field: ol
> 2015-04-13 13:43:35.104 ERROR 16733 --- [ool-46-thread-1] 
> o.a.gora.accumulo.store.AccumuloStore: Mapping not found for field: mk
> 2015-04-13 13:43:35.115 ERROR 16733 --- [ool-46-thread-1] 
> o.a.gora.accumulo.store.AccumuloStore: Mapping not found for field: mtdt
> {code}
> As a result, the query is not restricted to any field and the *entire row is 
> deleted*.
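
The failure mode described above, translating an already-translated name a second time, can be modelled in a few lines. This is a stand-alone illustration, not the AccumuloStore code: the class, the method names and the field-to-column mapping are hypothetical, and the "empty restriction means whole row" rule is taken from the issue description.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class DoubleTranslationSketch {
    // Illustrative mapping: field "outlinks" is stored in column "ol", and so on.
    private final Map<String, String> fieldToColumn = new HashMap<>();

    DoubleTranslationSketch() {
        fieldToColumn.put("outlinks", "ol");
        fieldToColumn.put("markers", "mk");
    }

    /** Translates query field names to columns; unknown names are logged and skipped. */
    List<String> columnsFor(List<String> queryFields) {
        List<String> columns = new ArrayList<>();
        for (String field : queryFields) {
            String column = fieldToColumn.get(field);
            if (column == null) {
                System.err.println("Mapping not found for field: " + field);
            } else {
                columns.add(column);
            }
        }
        return columns;
    }

    /** An empty column restriction means the delete covers the entire row. */
    boolean deletesEntireRow(List<String> queryFields) {
        return columnsFor(queryFields).isEmpty();
    }

    public static void main(String[] args) {
        DoubleTranslationSketch store = new DoubleTranslationSketch();
        // Query carries the field name: the delete is restricted to one column.
        System.out.println(store.deletesEntireRow(Collections.singletonList("outlinks")));
        // Query carries the already-translated column name: nothing maps, so the row goes.
        System.out.println(store.deletesEntireRow(Collections.singletonList("ol")));
    }
}
```

In this model either the caller passes field names, or the delete path skips the second translation; treating an unmapped name as a hard error rather than a log line would also stop the silent widening of the delete.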





[jira] [Updated] (GORA-419) AccumuloStore.put deletes entire row when updating map/array field

2015-08-26 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-419:
--
Assignee: Gerhard Gossen






[jira] [Updated] (GORA-420) AccumuloStore.createSchema fails when table already exists

2015-08-26 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-420:
--
Assignee: Gerhard Gossen

> AccumuloStore.createSchema fails when table already exists
> --
>
> Key: GORA-420
> URL: https://issues.apache.org/jira/browse/GORA-420
> Project: Apache Gora
>  Issue Type: Improvement
>  Components: gora-accumulo
>Affects Versions: 0.6
>Reporter: Gerhard Gossen
>Assignee: Gerhard Gossen
>Priority: Minor
>  Labels: patch
>
> When {{autoCreateSchema}} is enabled, AccumuloStore.initialize will try to 
> create the table each time without checking for its existence. This fails 
> with an exception that is logged at level ERROR (see below). As this happens 
> frequently, this clutters the log.
> {code}
> 2015-04-15 16:19:52.193 ERROR 29747 --- o.a.gora.accumulo.store.AccumuloStore 
>: Table crawl_2_webpage exists (Table name already exists: crawl_2_webpage)
> org.apache.accumulo.core.client.TableExistsException: Table crawl_2_webpage 
> exists (Table name already exists: crawl_2_webpage)
>   at 
> org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperationsImpl.java:302)
>   at 
> org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperationsImpl.java:280)
>   at 
> org.apache.accumulo.core.client.admin.TableOperationsImpl.create(TableOperationsImpl.java:208)
>   at 
> org.apache.accumulo.core.client.admin.TableOperationsImpl.create(TableOperationsImpl.java:177)
>   at 
> org.apache.gora.accumulo.store.AccumuloStore.createSchema(AccumuloStore.java:454)
>   at 
> org.apache.gora.accumulo.store.AccumuloStore.initialize(AccumuloStore.java:372)
> [...]
> Caused by: 
> org.apache.accumulo.core.client.impl.thrift.ThriftTableOperationException: 
> null
>   at 
> org.apache.accumulo.core.master.thrift.MasterClientService$executeTableOperation_result$executeTableOperation_resultStandardScheme.read(MasterClientService.java:16129)
>   at 
> org.apache.accumulo.core.master.thrift.MasterClientService$executeTableOperation_result$executeTableOperation_resultStandardScheme.read(MasterClientService.java:16106)
>   at 
> org.apache.accumulo.core.master.thrift.MasterClientService$executeTableOperation_result.read(MasterClientService.java:16048)
>   at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
>   at 
> org.apache.accumulo.core.master.thrift.MasterClientService$Client.recv_executeTableOperation(MasterClientService.java:499)
>   at 
> org.apache.accumulo.core.master.thrift.MasterClientService$Client.executeTableOperation(MasterClientService.java:480)
>   at 
> org.apache.accumulo.core.client.admin.TableOperationsImpl.executeTableOperation(TableOperationsImpl.java:236)
>   at 
> org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperationsImpl.java:289)
>   ... 99 common frames omitted
> {code}
> Suggested fix is to check for the existence of the table before calling 
> {{createSchema()}}.
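
The suggested guard can be sketched as below. To keep the example self-contained, `InMemoryTables` is a hypothetical stand-in for the table administration client, and `createSchemaIfAbsent` is an illustrative name rather than the method used in the actual patch.

```java
import java.util.HashSet;
import java.util.Set;

final class CreateSchemaSketch {
    /** Hypothetical stand-in for a table administration client. */
    static final class InMemoryTables {
        private final Set<String> tables = new HashSet<>();

        boolean exists(String name) {
            return tables.contains(name);
        }

        void create(String name) {
            if (!tables.add(name)) {
                throw new IllegalStateException("Table name already exists: " + name);
            }
        }
    }

    /** Creates the table only when it is missing; returns true if it was created. */
    static boolean createSchemaIfAbsent(InMemoryTables admin, String tableName) {
        if (admin.exists(tableName)) {
            return false; // nothing to do, and nothing to log at ERROR
        }
        admin.create(tableName);
        return true;
    }

    public static void main(String[] args) {
        InMemoryTables admin = new InMemoryTables();
        System.out.println(createSchemaIfAbsent(admin, "crawl_2_webpage"));
        System.out.println(createSchemaIfAbsent(admin, "crawl_2_webpage"));
    }
}
```

Check-then-create is not atomic when several clients initialize at once, so a real store would probably still catch the "already exists" exception and log it at a lower level.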





[jira] [Resolved] (GORA-420) AccumuloStore.createSchema fails when table already exists

2015-08-26 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney resolved GORA-420.
---
Resolution: Fixed

Counting objects: 13, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (8/8), done.
Writing objects: 100% (13/13), 1.10 KiB | 0 bytes/s, done.
Total 13 (delta 4), reused 0 (delta 0)
To https://git-wip-us.apache.org/repos/asf/gora.git
   a3f4425..1f6ba32  master -> master

Nice work [~gerhard.gossen]






[jira] [Commented] (GORA-386) Gora Spark Backend Support

2015-08-26 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14715663#comment-14715663
 ] 

Lewis John McGibbney commented on GORA-386:
---

Hi [~kamaci] can you please send a pull request so we can undertake further 
code review, test, etc?
Thank you

> Gora Spark Backend Support
> --
>
> Key: GORA-386
> URL: https://issues.apache.org/jira/browse/GORA-386
> Project: Apache Gora
>  Issue Type: New Feature
>  Components: gora-core
>Reporter: Talat UYARER
>Assignee: Furkan KAMACI
>  Labels: gsoc2015
> Fix For: 0.7
>
>
> Now Gora supports Map Reduce Framework. With this umbrella issue we try to 
> develop Apache Spark Backend. 





Re: Test fail

2015-08-25 Thread Lewis John Mcgibbney
Can you submit a patch to the issue please?

On Tue, Aug 25, 2015 at 10:15 AM, Cihad Guzel  wrote:

> Hi Lewis.
>
> The patch has a problem.
>
> T newObj = (T) obj.clone();
>
> This line is not valid, because T isn't cloneable.
>
> 2015-08-25 10:06 GMT+03:00 Cihad Guzel :
>
>> Thanks Lewis.
>>
>> I will try it and report back.
>>
>> 2015-08-25 9:59 GMT+03:00 Lewis John Mcgibbney > >:
>>
>>> Hi Cihad,
>>> Can you please try the patch below on Gora master branch
>>>
>>> https://issues.apache.org/jira/plugins/servlet/mobile#issue/GORA-228
>>>
>>> Then re-run your tests replacing the Gora core module with the newly
>>> patched version. Can you report back here and also comment on the issue if
>>> this fixes the problem?
>>> Thank you very much.
>>> Lewis
>>>
>>>
>>> On Monday, August 24, 2015, Cihad Guzel  wrote:
>>>
>>>> I developed it on the Nutch 2.x branch.
>>>>
>>>> 2015-08-25 9:56 GMT+03:00 Cihad Guzel :
>>>>
>>>>>
>>>>> Hi Lewis.
>>>>>
>>>>> There are some errors in the Nutch tests. The Nutch test methods are
>>>>> ignored. I consulted Talat and changed the datastore configuration from
>>>>> MemStore to HBaseStore for testing. I ran only one test at a time and
>>>>> ignored the other test methods, so the test ran successfully. If I
>>>>> run multiple test methods, I get an error. I don't understand it.
>>>>>
>>>>>
>>>>>
>>>>> 2015-08-25 8:02 GMT+03:00 Lewis John Mcgibbney <
>>>>> lewis.mcgibb...@gmail.com>:
>>>>>
>>>>>> Hi Cihad,
>>>>>>
>>>>>> Which version of Nutch 2.X are you working with when you get these
>>>>>> errors?
>>>>>>
>>>>>> On Sat, Aug 15, 2015 at 11:04 AM, 
>>>>>> wrote:
>>>>>>
>>>>>> >
>>>>>> >
>>>>>> > I ran TestInjector, but there is an exception as follows:
>>>>>> >
>>>>>> > java.util.NoSuchElementException
>>>>>> > at java.util.TreeMap.key(TreeMap.java:1221)
>>>>>> > at java.util.TreeMap.firstKey(TreeMap.java:285)
>>>>>> > at org.apache.gora.memory.store.MemStore.execute(MemStore.java:125)
>>>>>> > at
>>>>>> org.apache.nutch.util.CrawlTestUtil.readContents(CrawlTestUtil.java:114)
>>>>>> > at org.apache.nutch.crawl.TestInjector.readDb(TestInjector.java:108)
>>>>>> > at
>>>>>> org.apache.nutch.crawl.TestInjector.testInject(TestInjector.java:68)
>>>>>> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>> > at
>>>>>> >
>>>>>> >
>>>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>>> > at ...
>>>>>> >
>>>>>> > How can I solve it? How do I run the Nutch tests?
>>>>>> >
>>>>>> >
>>>>>>
>>>>>>
>>>>>> --
>>>>>> *Lewis*
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>> --
>>> *Lewis*
>>>
>>>
>>
>
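
On the cloning problem quoted above: in plain Java, `obj.clone()` is not callable on a bare type parameter, because `Object.clone()` is protected and `Cloneable` declares no methods. Assuming that is what the complaint is about, one way around it is to bound `T` by an interface with a public copy method. The sketch below uses a hypothetical `Copyable` interface and `Page` type; it is not the GORA-228 patch.

```java
final class GenericCopySketch {
    /** Hypothetical interface that exposes a public, typed copy operation. */
    interface Copyable<T> {
        T copy();
    }

    /** Example record type; the name and field are illustrative only. */
    static final class Page implements Copyable<Page> {
        final StringBuilder text;

        Page(String text) {
            this.text = new StringBuilder(text);
        }

        @Override
        public Page copy() {
            return new Page(text.toString()); // fresh buffer, not shared with the original
        }
    }

    // With an unbounded T, obj.clone() does not compile because Object.clone()
    // is protected. Bounding T by Copyable<T> makes the copy call legal.
    static <T extends Copyable<T>> T detach(T obj) {
        return obj.copy();
    }

    public static void main(String[] args) {
        Page original = new Page("a");
        Page snapshot = detach(original);
        original.text.append("b");
        System.out.println(snapshot.text); // unaffected by the later append
    }
}
```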


-- 
*Lewis*


Re: Spark Backend Support for Gora (GORA-386) Midterm Report

2015-07-01 Thread Lewis John Mcgibbney
This is fantastic.
Needless to say the project will be progressing through mid term.
Your blogging is very positive for dissemination of your work.
I would also like to extend a personal thank you to Talat. Excellent job, and on
behalf of the community here, an excellent effort to drive this GSoC
project so far, only halfway through :).
Looking forward to committing the initial patches into master branch and
also your LogManagerSpark which will lower the barrier to adopting the
module.
Thanks
Lewis

On Wednesday, July 1, 2015, Furkan KAMACI  wrote:

> Hi,
>
> First of all, I would like to thank you all. As you know, I've been
> accepted to GSoC 2015 with my proposal for developing a Spark Backend
> Support for Gora (GORA-386) and it is time for midterm evaluations. I
> want to share the current progress of the project and my midterm proposal as
> well.
>
> During my GSoC period, I've blogged at my personal website (
> http://furkankamaci.com/) and created a fork from Apache Gora's master
> branch and worked on it: https://github.com/kamaci/gora
>
> During the community bonding period, I read the Apache Gora documentation and
> source code to become more familiar
> with the project. I analyzed related projects including Apache Flink and
> Apache Crunch to implement a Spark backend in Apache Gora. I picked up
> an issue from Jira (https://issues.apache.org/jira/browse/GORA-262) and
> fixed it.
>
> During the coding period, because implementing this project requires background
> knowledge of Apache Spark, I started by analyzing Spark's first papers. I
> analyzed “Spark: Cluster Computing with Working Sets” (
> http://www.cs.berkeley.edu/~matei/papers/2010/hotcloud_spark.pdf) and
> “Resilient
> Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster
> Computing”
> (https://www.cs.berkeley.edu/~matei/papers/2012/nsdi_spark.pdf). I've
> published two posts about Spark and Cluster Computing
> (http://furkankamaci.com/spark-and-cluster-computing/) and Resilient
> Distributed Datasets (
> http://furkankamaci.com/resilient-distributed-datasets-rdds/) at my
> personal blog. I've followed Apache Spark documentation and developed
> examples to analyze RDDs.
>
> I've analyzed Apache Gora's GoraInputFormat class and Spark's newHadoopRDD
> method. I've implemented an example application to read data from Hbase.
>
> Apache Gora supports reading/writing data from/to Hadoop files. Spark has
> a method for generating an RDD compatible with Hadoop files. So, I designed
> an architecture that creates a bridge between GoraInputFormat and
> RDD, since both of them support Hadoop files.
>
> I've created a base class for Apache Gora and Spark integration named
> GoraSparkEngine. It has initialize methods that take a Spark context, a data
> store, and an optional Hadoop configuration, and return an RDD.
>
> After implementing a base for the GoraSpark engine, I developed a new
> example aligned with LogAnalytics, named
> LogAnalyticsSpark. I developed the map and reduce parts (except for writing
> results into the database), which do the same thing as
> LogAnalytics and also something more, i.e. printing the number of lines in
> tables.
>
> Once we get an RDD from the GoraSpark engine, we can operate on it just as
> we would on any other RDD that was not created over Apache Gora. The whole
> code can be found in the code base:
> https://github.com/kamaci/gora
>
> So far, project progress is ahead of the proposed timeline. The
> GoraInputFormat to RDD transformation is done, and it has been shown that
> map, reduce and other methods work properly on that kind of RDD.
>
> Before the next steps, I am planning to design an overall architecture
> based on feedback from the community. There are some constraints to keep
> in mind when designing it, e.g. the configuration of a Spark context
> cannot be changed after the context has been initialized.
>
> Once the necessary functionality is implemented, I will add examples,
> tests and documentation. After that, if I have extra time, I'm planning to
> run a performance benchmark comparing Apache Gora with Hadoop MapReduce,
> plain Hadoop MapReduce, plain Apache Spark and Apache Gora with Spark.
>
> Special thanks to Lewis and Talat. I should also mention that it is a great
> opportunity to be able to talk with your mentor face to face. I met with
> Talat many times and he helped me a lot with how Hadoop and Apache Gora
> work.
>
> PS: I've attached my midterm report and my previous reports can be found
> here:
>
> https://cwiki.apache.org/confluence/display/GORA/Spark+Backend+Support+for+Gora+%28GORA-386%29+Reports
>
> Kind Regards,
> Furkan KAMACI
>


-- 
*Lewis*


Re: "Apache:Big Data" Event

2015-06-23 Thread Lewis John Mcgibbney
Hey Alfonso,

On Tue, Jun 23, 2015 at 2:35 PM,  wrote:

>
> I expect you already have news about it, but I have been asked by Angela
> Brown, Senior Director of Events in The Linux Foundation, if anyone of you
> would be interested in submitting a proposal to "Apache: Big Data".
>
> http://www.apachecon.com/
> http://events.linuxfoundation.org/events/apache-big-data-europe
>
> The event is just before ApacheCon in the same place.
> Feel free to share :)
>
>
Nice, are you putting something in? I can't make it over to ApacheCon EU
this year. Have fun.
Lewis


[jira] [Updated] (GORA-386) Gora Spark Backend Support

2015-06-15 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-386:
--
Assignee: Furkan KAMACI

> Gora Spark Backend Support
> --
>
> Key: GORA-386
> URL: https://issues.apache.org/jira/browse/GORA-386
> Project: Apache Gora
>  Issue Type: New Feature
>  Components: gora-core
>Reporter: Talat UYARER
>Assignee: Furkan KAMACI
>  Labels: gsoc2015
> Fix For: 0.7
>
>
> Now Gora supports Map Reduce Framework. With this umbrella issue we try to 
> develop Apache Spark Backend. 





[jira] [Commented] (GORA-423) BSONDecorator returns empty string for null field value

2015-06-01 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568178#comment-14568178
 ] 

Lewis John McGibbney commented on GORA-423:
---

[~drazzib] can you review please? If not, can you comment and I will review. 
Thanks.

> BSONDecorator returns empty string for null field value
> ---
>
> Key: GORA-423
> URL: https://issues.apache.org/jira/browse/GORA-423
> Project: Apache Gora
>  Issue Type: Bug
>  Components: gora-mongodb
>Reporter: Alexander Yastrebov
>
> BSONDecorator returns empty string for null field value
> See pull request
> https://github.com/apache/gora/pull/25





[jira] [Updated] (GORA-423) BSONDecorator returns empty string for null field value

2015-06-01 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-423:
--
Fix Version/s: 0.6.1

> BSONDecorator returns empty string for null field value
> ---
>
> Key: GORA-423
> URL: https://issues.apache.org/jira/browse/GORA-423
> Project: Apache Gora
>  Issue Type: Bug
>  Components: gora-mongodb
>Reporter: Alexander Yastrebov
> Fix For: 0.6.1
>
>
> BSONDecorator returns empty string for null field value
> See pull request
> https://github.com/apache/gora/pull/25





[jira] [Commented] (GORA-416) Error when populating data into Cassandra super column - InvalidRequestException(why:supercolumn parameter is not optional for super CF sc

2015-05-28 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14564063#comment-14564063
 ] 

Lewis John McGibbney commented on GORA-416:
---

Hi [~talat] [~kamaci] should I put some time in to this one again? This is the 
pressing issue for release of 0.7.

> Error when populating data into Cassandra super column - 
> InvalidRequestException(why:supercolumn parameter is not optional for super 
> CF sc
> --
>
> Key: GORA-416
> URL: https://issues.apache.org/jira/browse/GORA-416
> Project: Apache Gora
>  Issue Type: Bug
>  Components: gora-cassandra
>Affects Versions: 0.6
> Environment: Nutch 2.4-SNAPSHOT, Gora 0.6.1-SNAPSHOT, Hadoop 2.5.2, 
> Cassandra 2.0.7
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Blocker
> Fix For: 0.6.1
>
> Attachments: GORA-416.patch
>
>
> Error when populating data into Cassandra super column.
> {code}
> lmcgibbn@LMC-032857 /usr/local/2webgui/runtime/local(master) $ ./bin/nutch 
> fetch 1425410774-370456822
> FetcherJob: starting at 2015-03-03 11:27:57
> FetcherJob: batchId: 1425410774-370456822
> FetcherJob: threads: 10
> FetcherJob: parsing: false
> FetcherJob: resuming: false
> FetcherJob : timelimit set for : -1
> 2015-03-03 11:27:58.101 java[3267:1903] Unable to load realm info from 
> SCDynamicStore
> Using queue mode : byHost
> Fetcher: threads: 10
> QueueFeeder finished: total 1 records. Hit by time limit :0
> fetching http://nutch.apache.org/ (queue crawl delay=5000ms)
> -finishing thread FetcherThread1, activeThreads=1
> -finishing thread FetcherThread2, activeThreads=1
> -finishing thread FetcherThread3, activeThreads=1
> -finishing thread FetcherThread4, activeThreads=1
> -finishing thread FetcherThread5, activeThreads=1
> -finishing thread FetcherThread6, activeThreads=1
> -finishing thread FetcherThread7, activeThreads=1
> -finishing thread FetcherThread8, activeThreads=1
> Fetcher: throughput threshold: -1
> -finishing thread FetcherThread9, activeThreads=1
> Fetcher: throughput threshold sequence: 5
> -finishing thread FetcherThread0, activeThreads=0
> 0/0 spinwaiting/active, 1 pages, 0 errors, 0.2 0 pages/s, 82 82 kb/s, 0 URLs 
> in 0 queues
> -activeThreads=0
> me.prettyprint.hector.api.exceptions.HInvalidRequestException: 
> InvalidRequestException(why:supercolumn parameter is not optional for super 
> CF sc)
>   at 
> me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:52)
>   at 
> me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:260)
>   at 
> me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:113)
>   at 
> me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:243)
>   at 
> me.prettyprint.cassandra.model.MutatorImpl.insert(MutatorImpl.java:69)
>   at 
> org.apache.gora.cassandra.store.HectorUtils.insertColumn(HectorUtils.java:46)
>   at 
> org.apache.gora.cassandra.store.CassandraClient.addColumn(CassandraClient.java:293)
>   at 
> org.apache.gora.cassandra.store.CassandraStore.addOrUpdateField(CassandraStore.java:512)
>   at 
> org.apache.gora.cassandra.store.CassandraStore.addOrUpdateField(CassandraStore.java:598)
>   at 
> org.apache.gora.cassandra.store.CassandraStore.flush(CassandraStore.java:316)
>   at 
> org.apache.gora.cassandra.store.CassandraStore.close(CassandraStore.java:160)
>   at 
> org.apache.gora.mapreduce.GoraRecordWriter.close(GoraRecordWriter.java:56)
>   at 
> org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.close(ReduceTask.java:550)
>   at 
> org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:629)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: InvalidRequestException(why:supercolumn 

[jira] [Commented] (GORA-416) Error when populating data into Cassandra super column - InvalidRequestException(why:supercolumn parameter is not optional for super CF sc

2015-05-19 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550897#comment-14550897
 ] 

Lewis John McGibbney commented on GORA-416:
---

Please see my initial pull request, which I should have updated but have sadly 
not. It identifies (from within CassandraClient) the general area we need to 
focus on to ensure that nested RECORDs are persisted as super columns as per 
the current Cassandra data modeling we abide by!
For reference, I suggest that we move away from the old super column data 
modeling, which was deprecated some time ago. We can deal with this in the 0.7 
development drive alongside GSoC.



[jira] [Commented] (GORA-416) Error when populating data into Cassandra super column - InvalidRequestException(why:supercolumn parameter is not optional for super CF sc

2015-05-19 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550830#comment-14550830
 ] 

Lewis John McGibbney commented on GORA-416:
---

[~kamaci] would you like to have a crack at this one? I think it would be a 
great issue to work on within the community bonding period as essentially it is 
blocking a release. WDYT?


[jira] [Resolved] (GORA-262) Add support for HTTPClient authentication in gora-solr

2015-05-19 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney resolved GORA-262.
---
Resolution: Fixed

Nice fix [~kamaci] thank you for this one

> Add support for HTTPClient authentication in gora-solr  
> 
>
> Key: GORA-262
> URL: https://issues.apache.org/jira/browse/GORA-262
> Project: Apache Gora
>  Issue Type: Improvement
>  Components: gora-solr
>    Reporter: Lewis John McGibbney
>Assignee: Furkan KAMACI
> Fix For: 0.6.1
>
> Attachments: GORA-262.patch
>
>
> This is the next logical progression once GORA-260 is addressed. Security is 
> always an issue when writing into Solr. This issue should introduce exactly 
> that.





[jira] [Updated] (GORA-262) Add support for HTTPClient authentication in gora-solr

2015-05-19 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-262:
--
Fix Version/s: (was: 0.7)
   0.6.1

> Add support for HTTPClient authentication in gora-solr  
> 
>
> Key: GORA-262
> URL: https://issues.apache.org/jira/browse/GORA-262
> Project: Apache Gora
>  Issue Type: Improvement
>  Components: gora-solr
>    Reporter: Lewis John McGibbney
>Assignee: Furkan KAMACI
> Fix For: 0.6.1
>
> Attachments: GORA-262.patch
>
>
> This is the next logical progression once GORA-260 is addressed. Security is 
> always an issue when writing into Solr. This issue should introduce exactly 
> that.





[jira] [Commented] (GORA-262) Add support for HTTPClient authentication in gora-solr

2015-05-18 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548283#comment-14548283
 ] 

Lewis John McGibbney commented on GORA-262:
---

This patch looks good. If there are no objections I'll commit by EoB today.

> Add support for HTTPClient authentication in gora-solr  
> 
>
> Key: GORA-262
> URL: https://issues.apache.org/jira/browse/GORA-262
> Project: Apache Gora
>  Issue Type: Improvement
>  Components: gora-solr
>    Reporter: Lewis John McGibbney
>Assignee: Furkan KAMACI
> Fix For: 0.7
>
> Attachments: GORA-262.patch
>
>
> This is the next logical progression once GORA-260 is addressed. Security is 
> always an issue when writing into Solr. This issue should introduce exactly 
> that.





[jira] [Updated] (GORA-262) Add support for HTTPClient authentication in gora-solr

2015-05-18 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-262:
--
Assignee: Furkan KAMACI

> Add support for HTTPClient authentication in gora-solr  
> 
>
> Key: GORA-262
> URL: https://issues.apache.org/jira/browse/GORA-262
> Project: Apache Gora
>  Issue Type: Improvement
>  Components: gora-solr
>    Reporter: Lewis John McGibbney
>Assignee: Furkan KAMACI
> Fix For: 0.7
>
> Attachments: GORA-262.patch
>
>
> This is the next logical progression once GORA-260 is addressed. Security is 
> always an issue when writing into Solr. This issue should introduce exactly 
> that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (GORA-262) Add support for HTTPClient authentication in gora-solr

2015-05-18 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-262:
--
Assignee: (was: Lewis John McGibbney)

> Add support for HTTPClient authentication in gora-solr  
> 
>
> Key: GORA-262
> URL: https://issues.apache.org/jira/browse/GORA-262
> Project: Apache Gora
>  Issue Type: Improvement
>  Components: gora-solr
>    Reporter: Lewis John McGibbney
> Fix For: 0.7
>
> Attachments: GORA-262.patch
>
>
> This is the next logical progression once GORA-260 is addressed. Security is 
> always an issue when writing into Solr. This issue should introduce exactly 
> that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (GORA-422) is not a valid DOAP class

2015-05-14 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney resolved GORA-422.
---
Resolution: Pending Closed

Argh thanks Seb. It's now done. 
Thank you
Lewis

>  is not a valid DOAP class
> 
>
> Key: GORA-422
> URL: https://issues.apache.org/jira/browse/GORA-422
> Project: Apache Gora
>  Issue Type: Bug
> Environment: 
> http://svn.apache.org/repos/asf/gora/cms_site/trunk/content/current/doap_Gora.rdf
>Reporter: Sebb
>Assignee: Lewis John McGibbney
> Fix For: 0.6.1
>
>
>   is not valid DOAP syntax.
> This is why there are no releases listed under
> http://projects.apache.org/projects/gora.html
> Please remove the  and  lines from the DOAP
> Thanks!





Re: GSoC Period of Spark Backend Support for Gora (GORA-386)

2015-05-11 Thread Lewis John Mcgibbney
Hi Furkan,

On Mon, May 11, 2015 at 11:14 AM,  wrote:

>
>
> What do you suggest me for next steps (everybody can comment on this, not
> just my mentors)?
>

I think this looks great. I would say: 1) make sure that as much as possible
gets on to the wiki; 2) keep as much of your work as possible on Jira, so
that we all know what is going on; 3) make reporting on GSoC gradual, more
like blog-style reporting, where incremental updates will work much better.


> On the other hand, Lewis and Talat, when do you want me to start weekly
> reporting process?
>
>
Right now you can begin by 1) making an attempt to stabilize trunk by
addressing the bug in
https://issues.apache.org/jira/browse/GORA-416
Then make an attempt to possibly pick up some other low hanging issues
which will not take too much investment of your time.
How does this sound?
Thanks
Lewis


[jira] [Updated] (GORA-422) is not a valid DOAP class

2015-05-11 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-422:
--
Fix Version/s: 0.6.1

>  is not a valid DOAP class
> 
>
> Key: GORA-422
> URL: https://issues.apache.org/jira/browse/GORA-422
> Project: Apache Gora
>  Issue Type: Bug
> Environment: 
> http://svn.apache.org/repos/asf/gora/cms_site/trunk/content/current/doap_Gora.rdf
>Reporter: Sebb
>    Assignee: Lewis John McGibbney
> Fix For: 0.6.1
>
>
>   is not valid DOAP syntax.
> This is why there are no releases listed under
> http://projects.apache.org/projects/gora.html
> Please remove the  and  lines from the DOAP
> Thanks!





[jira] [Resolved] (GORA-422) is not a valid DOAP class

2015-05-11 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney resolved GORA-422.
---
Resolution: Fixed
  Assignee: Lewis John McGibbney

Committed @revision 1678785 in trunk

>  is not a valid DOAP class
> 
>
> Key: GORA-422
> URL: https://issues.apache.org/jira/browse/GORA-422
> Project: Apache Gora
>  Issue Type: Bug
> Environment: 
> http://svn.apache.org/repos/asf/gora/cms_site/trunk/content/current/doap_Gora.rdf
>Reporter: Sebb
>    Assignee: Lewis John McGibbney
>
>   is not valid DOAP syntax.
> This is why there are no releases listed under
> http://projects.apache.org/projects/gora.html
> Please remove the  and  lines from the DOAP
> Thanks!





[jira] [Commented] (GORA-422) is not a valid DOAP class

2015-05-11 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538333#comment-14538333
 ] 

Lewis John McGibbney commented on GORA-422:
---

For those interested, the DOAP for Gora is available at 
https://svn.apache.org/repos/asf/gora/committers

>  is not a valid DOAP class
> 
>
> Key: GORA-422
> URL: https://issues.apache.org/jira/browse/GORA-422
> Project: Apache Gora
>  Issue Type: Bug
> Environment: 
> http://svn.apache.org/repos/asf/gora/cms_site/trunk/content/current/doap_Gora.rdf
>Reporter: Sebb
>    Assignee: Lewis John McGibbney
>
>   is not valid DOAP syntax.
> This is why there are no releases listed under
> http://projects.apache.org/projects/gora.html
> Please remove the  and  lines from the DOAP
> Thanks!





[REPORT] Apache Gora

2015-05-06 Thread Lewis John Mcgibbney
## Description:
   The Apache Gora open source framework provides an in-memory data model
   and persistence for big data. Gora supports persisting to column stores,
   key value stores, document stores and RDBMSs, and analyzing the data with
   extensive Apache Hadoop MapReduce support.

## Activity:
 - Project activity has been pretty low in last quarter with an issue
   blocking the release of a bug fix for Gora 0.6.1. The goal is to
   amend this and release the bug fix ASAP.
 - Gora has however AGAIN been accepted into the GSoC program so
   renewed development will be kicking off on new features for the
   codebase. More updates at the next reporting period.

## Issues:
   There are no issues requiring board attention at this time

## PMC/Committership changes:

 - Currently 20 committers and 20 PMC members in the project.
 - No new PMC members added in the last 3 months
 - Last PMC addition was Talat Uyarer at Mon Jan 26 2015
 - No new committers added in the last 3 months
 - Last committer addition was Talat Uyarer at Mon Jan 26 2015

## Releases:

 - Gora 0.6 was released on February 19th, 2015.

## Mailing list activity:

 - dev@gora.apache.org:
    - 72 subscribers (up 2 in the last 3 months):
    - 318 emails sent to list (158 in previous quarter)

 - u...@gora.apache.org:
    - 64 subscribers (up 4 in the last 3 months):
    - 54 emails sent to list (20 in previous quarter)


## JIRA activity:

 - 15 JIRA tickets created in the last 3 months
 - 7 JIRA tickets closed/resolved in the last 3 months

-- 
*Lewis*


[jira] [Updated] (GORA-377) Implement gora-metamodel module

2015-04-06 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-377:
--
Labels: memex  (was: )

> Implement gora-metamodel module
> ---
>
> Key: GORA-377
> URL: https://issues.apache.org/jira/browse/GORA-377
> Project: Apache Gora
>  Issue Type: New Feature
>  Components: gora-metamodel
>    Reporter: Lewis John McGibbney
>  Labels: memex
> Fix For: 0.7
>
> Attachments: GORA-377_patch1.diff
>
>
> There have been recent discussions [0] around implementing MetaModel [1] in a 
> drive to improve the Gora Query model and functionality.
> This issue is merely a placeholder for implementing exactly that.
> [0] http://www.mail-archive.com/dev%40gora.apache.org/msg05149.html
> [1] http://metamodel.incubator.apache.org/





[jira] [Updated] (GORA-376) Gora Cassandra doesn't accept user credentials for connection

2015-04-06 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-376:
--
Labels: memex patch  (was: patch)

> Gora Cassandra doesn't accept user credentials for connection
> -
>
> Key: GORA-376
> URL: https://issues.apache.org/jira/browse/GORA-376
> Project: Apache Gora
>  Issue Type: Bug
>  Components: gora-cassandra
>Affects Versions: 0.3, 0.4, 0.5
>Reporter: Viju Kothuvatiparambil
>Assignee: Viju Kothuvatiparambil
>  Labels: memex, patch
> Fix For: 0.6
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> The following properties are defined in gora.properties
> gora.cassandrastore.servers=host:port
> gora.cassandrastore.username=username
> gora.cassandrastore.password=password
> Initialization method in CassandraClient.java only takes the host name, but 
> credentials are not passed to the constructor for CassandraHostConfigurator.
> public void initialize(Class keyClass, Class persistentClass) throws 
> Exception {
> 
> this.cluster = 
> HFactory.getOrCreateCluster(this.cassandraMapping.getClusterName(),
> new CassandraHostConfigurator(this.cassandraMapping.getHostName()));
> 
> }
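
The gap described above can be illustrated with a minimal sketch, assuming
plain java.util.Properties in place of Gora's own property loading. The class
name CassandraCredentialsSketch and the helper credentialsFrom are
hypothetical and not part of Gora, and the "username" and "password" map keys
are an assumption about what the Cassandra client expects. The sketch only
shows how the two credential properties could be collected into a map; whether
and how the client library accepts such a map alongside
CassandraHostConfigurator should be checked against the Hector version in use.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper, not part of Gora: collects the credential properties
// named in the issue into a map that a cluster factory could be given.
class CassandraCredentialsSketch {

    // Property names are taken from the issue description.
    static final String USERNAME_KEY = "gora.cassandrastore.username";
    static final String PASSWORD_KEY = "gora.cassandrastore.password";

    // Returns an empty map when either credential is missing, so callers can
    // fall back to an unauthenticated connection.
    static Map<String, String> credentialsFrom(Properties props) {
        Map<String, String> credentials = new HashMap<>();
        String username = props.getProperty(USERNAME_KEY);
        String password = props.getProperty(PASSWORD_KEY);
        if (username != null && password != null) {
            // Assumed key names for the credentials map.
            credentials.put("username", username);
            credentials.put("password", password);
        }
        return credentials;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("gora.cassandrastore.servers", "host:port");
        props.setProperty(USERNAME_KEY, "username");
        props.setProperty(PASSWORD_KEY, "password");
        // Only the key names are printed, never the secret values.
        System.out.println(credentialsFrom(props).keySet());
    }
}
```

Returning an empty map rather than null keeps the unauthenticated case on the
same code path as the authenticated one.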





[jira] [Updated] (GORA-378) Log error trace as well as error message in GoraRecordWriter

2015-04-06 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-378:
--
Labels: memex  (was: )

> Log error trace as well as error message in GoraRecordWriter
> 
>
> Key: GORA-378
> URL: https://issues.apache.org/jira/browse/GORA-378
> Project: Apache Gora
>  Issue Type: Task
>  Components: gora-core
>Affects Versions: 0.5
>    Reporter: Lewis John McGibbney
>    Assignee: Lewis John McGibbney
>  Labels: memex
> Fix For: 0.6
>
>
> Right now I am logging a rather annoying error when attempting to flush Super 
> Columns to Cassandra 2.0.7 which reads as follows
> 2014-09-26 20:43:15,847 WARN  mapreduce.GoraRecordWriter - Exception at 
> GoraRecordWriter.class while closing 
> datastore.InvalidRequestException(why:supercolumn parameter is not optional 
> for super CF sc)
> Yes, this is useful, however it would be better (for debugging purposes) if I 
> could catch an entire stack trace as well within the log output.
> Patch coming up for GoraRecordWriter which does just this.
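
A minimal sketch of the idea, assuming nothing about the actual patch: pass the exception object itself to the logger instead of only its message, so that the stack trace ends up in the log output. The class and method names are hypothetical, and `java.util.logging` is used here only to keep the example free of external dependencies.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical example class; not the GoraRecordWriter patch. */
final class StackTraceLogging {

    private static final Logger LOG =
        Logger.getLogger(StackTraceLogging.class.getName());

    /** Renders the full stack trace of a throwable as a string. */
    static String stackTraceOf(Throwable t) {
        StringWriter buffer = new StringWriter();
        t.printStackTrace(new PrintWriter(buffer, true));
        return buffer.toString();
    }

    /** Closes a resource and logs any failure together with its stack trace. */
    static void closeAndLog(AutoCloseable store) {
        try {
            store.close();
        } catch (Exception e) {
            // Handing over the exception (not just e.getMessage()) lets the
            // logging framework print the stack trace as well.
            LOG.log(Level.WARNING, "Exception while closing datastore", e);
        }
    }

    public static void main(String[] args) {
        closeAndLog(() -> {
            throw new IllegalStateException("supercolumn parameter is not optional");
        });
    }
}
```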





[jira] [Updated] (GORA-374) Implement Rackspace Cloud Orchestration in GoraCI

2015-04-06 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-374:
--
Labels: memex  (was: )

> Implement Rackspace Cloud Orchestration in GoraCI
> -
>
> Key: GORA-374
> URL: https://issues.apache.org/jira/browse/GORA-374
> Project: Apache Gora
>  Issue Type: Bug
>Affects Versions: 0.6
>    Reporter: Lewis John McGibbney
>    Assignee: Lewis John McGibbney
>  Labels: memex
> Fix For: 0.6
>
>
> We need to implement the Rackspace Cloud orchestration API within gora-goraci 
> in order to run the test suite under INFRA RAX.
> I've implemented this and will send a PR soon.
> For reference
> https://developer.rackspace.com/docs/cloud-servers/getting-started/?lang=java
> http://jclouds.apache.org/reference/javadoc/1.8.x/





[jira] [Updated] (GORA-381) Fix Guava dependency mismatch post GoraCI

2015-04-06 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-381:
--
Labels: memex  (was: )

> Fix Guava dependency mismatch post GoraCI
> -
>
> Key: GORA-381
> URL: https://issues.apache.org/jira/browse/GORA-381
> Project: Apache Gora
>  Issue Type: New Feature
>  Components: gora-cassandra
>Affects Versions: 0.6
>    Reporter: Lewis John McGibbney
>    Assignee: Lewis John McGibbney
>  Labels: memex
> Fix For: 0.6
>
>
> I recently broke the build after committing GORA-374.
> This only happened when I failed to flush my local .m2 cache and that is why 
> I didn't catch it in time.
> There is a mismatch in the Guava versioning which means we get the following 
> when attempting to set up the local C* server for tests.
> jb-9-Data.db'), 
> SSTableReader(path='target/test/var/lib/cassandra/data/system/schema_keyspaces/system-schema_keyspaces-jb-19-Data.db'),
>  
> SSTableReader(path='target/test/var/lib/cassandra/data/system/schema_keyspaces/system-schema_keyspaces-jb-16-Data.db'),
>  
> SSTableReader(path='target/test/var/lib/cassandra/data/system/schema_keyspaces/system-schema_keyspaces-jb-3-Data.db')]
> 14/09/28 04:16:40 ERROR service.CassandraDaemon: Exception in thread 
> Thread[CompactionExecutor:15,1,main]
> java.lang.NoSuchMethodError: 
> com.google.common.util.concurrent.RateLimiter.acquire(I)V
>   at 
> org.apache.cassandra.io.compress.CompressedThrottledReader.reBuffer(CompressedThrottledReader.java:40)
>   at 
> org.apache.cassandra.io.util.RandomAccessReader.seek(RandomAccessReader.java:280)
>   at 
> org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.computeNext(SSTableScanner.java:256)
>   at 
> org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.computeNext(SSTableScanner.java:197)
>   at 
> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>   at 
> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>   at 
> org.apache.cassandra.io.sstable.SSTableScanner.hasNext(SSTableScanner.java:177)
>   at 
> org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:144)
>   at 
> org.apache.cassandra.utils.MergeIterator$ManyToOne.<init>(MergeIterator.java:87)
>   at org.apache.cassandra.utils.MergeIterator.get(MergeIterator.java:46)
>   at 
> org.apache.cassandra.db.compaction.CompactionIterable.iterator(CompactionIterable.java:47)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:129)
>   at 
> org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
>   at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:197)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:724)
> We need to address this before we move onwards. One option is to exclude Guava 
> from the transitive gora-core dependencies and then add it as an individual 
> import on the gora-cassandra side.
> We will soon find out once I hack this tomorrow.
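
The descriptor `acquire(I)V` in the trace reads as "takes an int, returns void", so the calling code was compiled against a Guava in which that method had this shape, while the Guava found at runtime apparently does not. When chasing this kind of mismatch it can help to ask the JVM which jar a class really came from and what a method's return type is at runtime. The sketch below is one way to do that with plain reflection; the class `ClasspathProbe` is hypothetical, and the default probe target is a JDK class only so that the example runs without Guava on the classpath.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;

/** Hypothetical diagnostic helper for classpath/version mismatches. */
final class ClasspathProbe {

    /** Where a class was loaded from, or "(bootstrap)" if no location is known. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        return source == null ? "(bootstrap)" : String.valueOf(source.getLocation());
    }

    /** Runtime return type of the public method with the given name and parameters. */
    static String returnTypeOf(Class<?> type, String method, Class<?>... params)
            throws NoSuchMethodException {
        Method m = type.getMethod(method, params);
        return m.getReturnType().getName();
    }

    public static void main(String[] args) throws Exception {
        // With Guava on the classpath one would pass
        // com.google.common.util.concurrent.RateLimiter as the argument and
        // then call returnTypeOf(type, "acquire", int.class).
        Class<?> type = Class.forName(args.length > 0 ? args[0] : "java.lang.String");
        System.out.println(type.getName() + " loaded from " + locationOf(type));
    }
}
```

Printing the location for the suspect class from inside the failing test setup shows directly whether the stale jar from the local .m2 cache is the one being picked up.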





[jira] [Updated] (GORA-407) Upgrade restlet dependencies to 2.3.1 for gora-solr

2015-04-06 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-407:
--
Labels: memex  (was: )

> Upgrade restlet dependencies to 2.3.1 for gora-solr
> ---
>
> Key: GORA-407
> URL: https://issues.apache.org/jira/browse/GORA-407
> Project: Apache Gora
>  Issue Type: Improvement
>  Components: gora-solr
>Affects Versions: 0.5
>    Reporter: Lewis John McGibbney
>    Assignee: Lewis John McGibbney
>  Labels: memex
> Fix For: 0.6
>
>
> When building Nutch against gora-solr 0.6, we are getting dependency 
> retrieval errors with restlet 2.2.1 dependencies.
> This issue merely upgrades restlet to 2.3.1.





[jira] [Updated] (GORA-375) Upgrade HBase to 0.98

2015-04-06 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-375:
--
Labels: hbase memex  (was: hbase)

> Upgrade HBase to 0.98
> -
>
> Key: GORA-375
> URL: https://issues.apache.org/jira/browse/GORA-375
> Project: Apache Gora
>  Issue Type: Improvement
>  Components: gora-hbase
>Affects Versions: 0.5
>Reporter: Ted Yu
>Assignee: Talat UYARER
>  Labels: hbase, memex
> Fix For: 0.6
>
> Attachments: GORA-375-hadoop2.patch, GORA-375.patch, 
> org.apache.gora.avro.mapreduce.TestDataFileAvroStoreMapReduce.txt
>
>
> HBase 0.98 release is the current stable release.
> Gora should be built based on HBase 0.98.





[jira] [Updated] (GORA-406) Upgrade Solr dependencies to 4.10.3

2015-04-06 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-406:
--
Labels: memex  (was: )

> Upgrade Solr dependencies to 4.10.3
> ---
>
> Key: GORA-406
> URL: https://issues.apache.org/jira/browse/GORA-406
> Project: Apache Gora
>  Issue Type: Improvement
>  Components: gora-solr
>Affects Versions: 0.6
>    Reporter: Lewis John McGibbney
>    Assignee: Lewis John McGibbney
>  Labels: memex
> Fix For: 0.6
>
>
> Usual upgrade of the Gora Solr components to Solr 4.10.X





[jira] [Updated] (GORA-346) Create shim layer to support multiple hadoop versions

2015-04-06 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-346:
--
Labels: memex patch  (was: patch)

> Create shim layer to support multiple hadoop versions
> -
>
> Key: GORA-346
> URL: https://issues.apache.org/jira/browse/GORA-346
> Project: Apache Gora
>  Issue Type: Improvement
>Affects Versions: 0.5
>Reporter: Renato Javier Marroquín Mogrovejo
>Assignee: Moritz Hoffmann
>  Labels: memex, patch
> Fix For: 0.5
>
> Attachments: GORA-345_MUNGE_v1.patch, GORA-346_v1.patch, 
> GORA-346_v2.patch, GORA-346_v3.patch, GORA-346_v4_profiles.patch, 
> GORA-346_v5.patch
>
>






[DEADLINE] Google Summer of Code Deadline Approaching Soon

2015-03-25 Thread Lewis John Mcgibbney
Hi All,
The deadline for this year's GSoC student submissions is approaching fast
and I would be very keen to see more proposals from the communities above.
I've been involved on and off with several students from across all of the
above communities, hence the reason I am emailing these lists.
I would strongly suggest that any students still planning on submitting
get their submissions in ASAP.
Thanks
Lewis


-- 
*Lewis*


[jira] [Commented] (GORA-98) Support CQL through Gora Hector API usage in Gora

2015-03-21 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-98?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14373002#comment-14373002
 ] 

Lewis John McGibbney commented on GORA-98:
--

+1
Please also check out GORA-267 iirc
We have numerous improvements pending.
Personally I would like to take time and think about making these changes.
The original patch submitted for 267 addresses composite key support;
however, it is not a migration to the DataStax Java driver for Gora.
I don't know if you are coding, but if you are, both Renato and myself are
using the Gora Cassandra module. I recognize the bug and will fix it soon
on 40x issue.




-- 
*Lewis*


> Support CQL through Gora Hector API usage in Gora
> -
>
> Key: GORA-98
> URL: https://issues.apache.org/jira/browse/GORA-98
> Project: Apache Gora
>  Issue Type: Sub-task
>  Components: gora-cassandra
>Affects Versions: 0.2
>    Reporter: Lewis John McGibbney
> Fix For: 0.7
>
>
> CQL queries were an interesting new feature for Apache Cassandra 0.8.0. The 
> initial implementation in Hector deals simply with the single 
> execute_cql_query thrift method and is largely intended as a means to test 
> drive query functionality and behavior, it would be nice to implement this 
> within Gora. 
> reference: http://rantav.github.com/hector/build/html/content/cql_basics.html 





[jira] [Commented] (GORA-98) Support CQL through Gora Hector API usage in Gora

2015-03-21 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-98?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14372879#comment-14372879
 ] 

Lewis John McGibbney commented on GORA-98:
--

Hi Marko,
You're correct.
If you look, you will see an announcement that, some years ago, Gora
decided to move to Hector support.
I am on the Hector lists, and although I disagree with your comment that
the code is unmaintained, I do agree that the default driver for
Cassandra is DataStax. There has been a pile of money, resources, effort
and public outreach poured into the driver development. For that reason, I
would certainly suggest that we close all Hector-related issues and port
the code to the DataStax Java API.
Previously, Renato Marroquin wrote a bunch of code which actually
implemented a pluggable storage abstraction layer within Gora-cassandra.
Although this meant that you were playing with Gora then an abstraction
then a Cassandra Client API then cassandra, it was extremely valid for
doing comparative analysis on Cassandra client behavior.
If you suggest we close hector related issues I am +1.
We currently have a major bug in persistence of nested UNION RECORDS in
Gora Cassandra. I submitted an initial, unfinished patch for review. I've
had no feedback so far and unfortunately I haven't had time to look more
thoroughly; I'm sorry about that.
If you would like to drive this issue it would be great. If not, it is also
great. Thank you for writing and letting us know that this is a required
issue.





-- 
*Lewis*


> Support CQL through Gora Hector API usage in Gora
> -
>
> Key: GORA-98
> URL: https://issues.apache.org/jira/browse/GORA-98
> Project: Apache Gora
>  Issue Type: Sub-task
>  Components: gora-cassandra
>    Affects Versions: 0.2
>Reporter: Lewis John McGibbney
> Fix For: 0.7
>
>
> CQL queries were an interesting new feature for Apache Cassandra 0.8.0. The 
> initial implementation in Hector deals simply with the single 
> execute_cql_query thrift method and is largely intended as a means to test 
> drive query functionality and behavior, it would be nice to implement this 
> within Gora. 
> reference: http://rantav.github.com/hector/build/html/content/cql_basics.html 





[jira] [Commented] (GORA-405) Create Gora REST API

2015-03-21 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14372872#comment-14372872
 ] 

Lewis John McGibbney commented on GORA-405:
---

Hi Furkan, I don't believe anyone is actively working on this. I got no 
feedback on my proposals for augmenting the existing Datastore API, so I 
moved on. I would really love you to work on this if you could; I would 
personally mentor a GSoC effort on a CXF-powered Gora REST API.

> Create Gora REST API
> 
>
> Key: GORA-405
> URL: https://issues.apache.org/jira/browse/GORA-405
> Project: Apache Gora
>  Issue Type: Improvement
>  Components: gora-compiler, gora-core
>Reporter: Udesh Liyanaarachchi
>Assignee: Udesh Liyanaarachchi
>Priority: Minor
> Fix For: 0.7
>
>
> As per the discussion [~lewismc] initiated in the 
> [mail list 
> thread|https://www.mail-archive.com/dev@gora.apache.org/msg05444.html] we 
> need to implement a REST API for GORA.
> This is the initial proposal documentation for the [GORA REST 
> API|http://docs.apachegoraapi.apiary.io/].
> The plan will be to implement the API with  [Apache CXF 
> |http://cxf.apache.org] using [CXF's JAXRS 
> |http://cxf.apache.org/docs/jax-rs.html] implementation.





[jira] [Commented] (GORA-405) Create Gora REST API

2015-03-21 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14372873#comment-14372873
 ] 

Lewis John McGibbney commented on GORA-405:
---

PS: feel free to mark it as GSOC if you want to work on it.

> Create Gora REST API
> 
>
> Key: GORA-405
> URL: https://issues.apache.org/jira/browse/GORA-405
> Project: Apache Gora
>  Issue Type: Improvement
>  Components: gora-compiler, gora-core
>Reporter: Udesh Liyanaarachchi
>Assignee: Udesh Liyanaarachchi
>Priority: Minor
> Fix For: 0.7
>
>
> As per the discussion [~lewismc] initiated in the 
> [mail list 
> thread|https://www.mail-archive.com/dev@gora.apache.org/msg05444.html] we 
> need to implement a REST API for GORA.
> This is the initial proposal documentation for the [GORA REST 
> API|http://docs.apachegoraapi.apiary.io/].
> The plan will be to implement the API with  [Apache CXF 
> |http://cxf.apache.org] using [CXF's JAXRS 
> |http://cxf.apache.org/docs/jax-rs.html] implementation.





[jira] [Commented] (GORA-262) Add support for HTTPClient authentication in gora-solr

2015-03-13 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14361092#comment-14361092
 ] 

Lewis John McGibbney commented on GORA-262:
---

Hi [~kamaci], please see the type of authentication implemented in Apache Nutch 
for an example
https://github.com/apache/nutch/blob/trunk/src/plugin/indexer-solr/src/java/org/apache/nutch/indexwriter/solr/SolrUtils.java#L34-L61

> Add support for HTTPClient authentication in gora-solr  
> 
>
> Key: GORA-262
> URL: https://issues.apache.org/jira/browse/GORA-262
> Project: Apache Gora
>  Issue Type: Improvement
>  Components: gora-solr
>    Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
> Fix For: 0.7
>
>
> This is the next logical progression once GORA-260 is addressed. Security is 
> always an issue when writing into Solr. This issue should introduce exactly 
> that.
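
As a rough illustration of the simplest form such authentication can take, the sketch below builds the `Authorization` header for the HTTP Basic scheme from a username and password. It is not taken from the linked Nutch code or from gora-solr; the class `BasicAuth` is hypothetical, and a real implementation would more likely configure credentials on the HTTP client used by the Solr client rather than build the header by hand.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical helper showing the HTTP Basic scheme; not gora-solr code. */
final class BasicAuth {

    /** Returns the value of the Authorization header for Basic auth. */
    static String header(String username, String password) {
        // The scheme encodes "username:password" as Base64.
        String pair = username + ":" + password;
        byte[] bytes = pair.getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(bytes);
    }

    public static void main(String[] args) {
        System.out.println("Authorization: " + header("user", "pass"));
    }
}
```

Basic credentials are only encoded, not encrypted, so this only makes sense over HTTPS.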





[jira] [Updated] (GORA-416) Error when populating data into Cassandra super column - InvalidRequestException(why:supercolumn parameter is not optional for super CF sc

2015-03-12 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-416:
--
Attachment: GORA-416.patch

Initial patch aimed at fixing this issue. It is not working but I do not have 
time to work on it right now. Sorry for the half-baked patch. Hopefully someone 
else can take a look.
[~renato2099] sorry for not having a PR earlier.

> Error when populating data into Cassandra super column - 
> InvalidRequestException(why:supercolumn parameter is not optional for super 
> CF sc
> --
>
> Key: GORA-416
> URL: https://issues.apache.org/jira/browse/GORA-416
> Project: Apache Gora
>  Issue Type: Bug
>  Components: gora-cassandra
>Affects Versions: 0.6
> Environment: Nutch 2.4-SNAPSHOT, Gora 0.6.1-SNAPSHOT, Hadoop 2.5.2, 
> Cassandra 2.0.7
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Blocker
> Fix For: 0.6.1
>
> Attachments: GORA-416.patch
>
>
> Error when populating data into Cassandra super column.
> {code}
> lmcgibbn@LMC-032857 /usr/local/2webgui/runtime/local(master) $ ./bin/nutch 
> fetch 1425410774-370456822
> FetcherJob: starting at 2015-03-03 11:27:57
> FetcherJob: batchId: 1425410774-370456822
> FetcherJob: threads: 10
> FetcherJob: parsing: false
> FetcherJob: resuming: false
> FetcherJob : timelimit set for : -1
> 2015-03-03 11:27:58.101 java[3267:1903] Unable to load realm info from 
> SCDynamicStore
> Using queue mode : byHost
> Fetcher: threads: 10
> QueueFeeder finished: total 1 records. Hit by time limit :0
> fetching http://nutch.apache.org/ (queue crawl delay=5000ms)
> -finishing thread FetcherThread1, activeThreads=1
> -finishing thread FetcherThread2, activeThreads=1
> -finishing thread FetcherThread3, activeThreads=1
> -finishing thread FetcherThread4, activeThreads=1
> -finishing thread FetcherThread5, activeThreads=1
> -finishing thread FetcherThread6, activeThreads=1
> -finishing thread FetcherThread7, activeThreads=1
> -finishing thread FetcherThread8, activeThreads=1
> Fetcher: throughput threshold: -1
> -finishing thread FetcherThread9, activeThreads=1
> Fetcher: throughput threshold sequence: 5
> -finishing thread FetcherThread0, activeThreads=0
> 0/0 spinwaiting/active, 1 pages, 0 errors, 0.2 0 pages/s, 82 82 kb/s, 0 URLs 
> in 0 queues
> -activeThreads=0
> me.prettyprint.hector.api.exceptions.HInvalidRequestException: 
> InvalidRequestException(why:supercolumn parameter is not optional for super 
> CF sc)
>   at 
> me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:52)
>   at 
> me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:260)
>   at 
> me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:113)
>   at 
> me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:243)
>   at 
> me.prettyprint.cassandra.model.MutatorImpl.insert(MutatorImpl.java:69)
>   at 
> org.apache.gora.cassandra.store.HectorUtils.insertColumn(HectorUtils.java:46)
>   at 
> org.apache.gora.cassandra.store.CassandraClient.addColumn(CassandraClient.java:293)
>   at 
> org.apache.gora.cassandra.store.CassandraStore.addOrUpdateField(CassandraStore.java:512)
>   at 
> org.apache.gora.cassandra.store.CassandraStore.addOrUpdateField(CassandraStore.java:598)
>   at 
> org.apache.gora.cassandra.store.CassandraStore.flush(CassandraStore.java:316)
>   at 
> org.apache.gora.cassandra.store.CassandraStore.close(CassandraStore.java:160)
>   at 
> org.apache.gora.mapreduce.GoraRecordWriter.close(GoraRecordWriter.java:56)
>   at 
> org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.close(ReduceTask.java:550)
>   at 
> org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:629)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   a

Re: Google Summer of Code 2015 Mentor Registration

2015-03-11 Thread Lewis John Mcgibbney
ACK

On Wednesday, March 11, 2015, Talat Uyarer  wrote:

> Gora PMC,
>
> Please acknowledge my request to become a mentor for Google Summer of
> Code 2015 projects for Apache Gora.
>
> My Melange username is talat.
>
> --
> Talat UYARER
> Websitesi: http://talat.uyarer.com
> Twitter: http://twitter.com/talatuyarer
> Linkedin: http://tr.linkedin.com/pub/talat-uyarer/10/142/304
>


-- 
*Lewis*


Re: Google Summer of Code (again)

2015-03-10 Thread Lewis John Mcgibbney
Hi Talat,

On Tue, Mar 10, 2015 at 1:15 AM,  wrote:

>
> Date: Mon, 9 Mar 2015 21:36:58 +0200
> Subject: Re: Google Summer of Code (again)
> Hi Lewis,
>
> I found students who want to contribute to Gora. Do we have any ideas for GSoC?
>
> I have GORA-225 and Gora Spark execution engine support in mind. What about
> you?
>
>
I am already tied in to GSoC on other projects, namely Any23, OODT and
Nutch.
Are you able to mentor a project this year in Gora? If not then I cannot
commit to something I will have no time to do :(


Re: Integrating gora-gradle-plugin into release management

2015-03-05 Thread Lewis John Mcgibbney
Hi Damien,

On Thu, Mar 5, 2015 at 6:22 PM,  wrote:

>
> Re: Integrating gora-gradle-plugin into release management
> 7456 by: Damien Raude-Morvan
>     7462 by: Lewis John Mcgibbney
>     7463 by: Lewis John Mcgibbney
> 7464 by: Damien Raude-Morvan
>
>
> You can allocate permissions to "drazzib" account on cwiki.
>
> DONE.
Thanks
Lewis


Re: Integrating gora-gradle-plugin into release management

2015-03-05 Thread Lewis John Mcgibbney
Actually, can anyone that would like write permissions to the Gora cwiki
instance please provide their username on this thread?
Thank you
Lewis

On Thu, Mar 5, 2015 at 9:28 AM, Lewis John Mcgibbney <
lewis.mcgibb...@gmail.com> wrote:

> What's your wiki username?
> Thanks
> Lewis
>
> On Thu, Mar 5, 2015 at 5:14 AM, Damien Raude-Morvan 
> wrote:
>
>> 2015-03-03 19:54 GMT+01:00 Lewis John Mcgibbney <
>> lewis.mcgibb...@gmail.com>:
>> > Hi Damien,
>>
>> Hi Lewis !
>>
>> > Can you please comment on what you see happening for initiating a
>> release
>> > management procedure for your Gradle plugin?
>> > We have a release management procedure [0] however it does not currently
>> > accommodate the Gradle plugin release.
>> > Any help to bake this in would be grand.
>>
>> Of course, I can provide guidance for releasing this plugin !
>> AFAIK, I'm not allowed to edit this Confluence wiki. What's the preferred
>> way to get the right permissions?
>>
>> Regards,
>> --
>> Damien
>>
>
>
>
> --
> *Lewis*
>



-- 
*Lewis*


Re: Integrating gora-gradle-plugin into release management

2015-03-05 Thread Lewis John Mcgibbney
What's your wiki username?
Thanks
Lewis

On Thu, Mar 5, 2015 at 5:14 AM, Damien Raude-Morvan 
wrote:

> 2015-03-03 19:54 GMT+01:00 Lewis John Mcgibbney  >:
> > Hi Damien,
>
> Hi Lewis !
>
> > Can you please comment on what you see happening for initiating a release
> > management procedure for your Gradle plugin?
> > We have a release management procedure [0] however it does not currently
> > accommodate the Gradle plugin release.
> > Any help to bake this in would be grand.
>
> Of course, I can provide guidance for releasing this plugin !
> AFAIK, I'm not allowed to edit this Confluence wiki. What's the preferred
> way to get the right permissions?
>
> Regards,
> --
> Damien
>



-- 
*Lewis*


[jira] [Comment Edited] (GORA-416) Error when populating data into Cassandra super column - InvalidRequestException(why:supercolumn parameter is not optional for super CF sc

2015-03-03 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345902#comment-14345902
 ] 

Lewis John McGibbney edited comment on GORA-416 at 3/3/15 10:45 PM:


OK so I've debugged this right through on the 
[FetcherJob|https://github.com/apache/nutch/blob/2.x/src/java/org/apache/nutch/fetcher/FetcherJob.java]
 task.
What is happening here is that we iterate through the UNION structure of the 
[protocolStatus 
field|https://github.com/apache/nutch/blob/2.x/src/gora/webpage.avsc#L58-L95] 
of the Nutch WebPage object, with the field value at position 1 being created 
as **protocolStatus_UnionIndex** and a [subColumn being created as we 
desire|https://github.com/apache/gora/blob/master/gora-cassandra/src/main/java/org/apache/gora/cassandra/store/CassandraStore.java#L590].
 
However once this has been done, when we come to the field value at position 1 
we use recursion on 
[addOrUpdateField|https://github.com/apache/gora/blob/master/gora-cassandra/src/main/java/org/apache/gora/cassandra/store/CassandraStore.java#L598]
 where we then encounter the [RECORD which is the actual 
protocolStatus|https://github.com/apache/gora/blob/master/gora-cassandra/src/main/java/org/apache/gora/cassandra/store/CassandraStore.java#L506].
 This one contains the actual value.
What happens now is that we [add this as a normal 
column|https://github.com/apache/gora/blob/master/gora-cassandra/src/main/java/org/apache/gora/cassandra/store/CassandraStore.java#L512]
 instead of the super column that it is defined as. This is what results in the 
InvalidRequestException.
Patch for master branch coming up. I will try to provide a test case as well 
which replicates the problem.
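
To make the intended behaviour easier to follow, here is a toy model of the recursion, assuming a heavily simplified schema representation. None of these types exist in Gora; `UnionWriteSketch`, `Node` and `Kind` are invented for the example, and the strings it produces merely name the kind of write one would expect at each step: the union index as a subcolumn first, then the resolved RECORD branch as a super column rather than a normal column.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Toy model only; these types are hypothetical and not Gora classes. */
final class UnionWriteSketch {

    enum Kind { PRIMITIVE, RECORD, UNION }

    /** Simplified stand-in for a schema node. */
    static final class Node {
        final Kind kind;
        final List<Node> branches; // only populated for UNION

        Node(Kind kind, List<Node> branches) {
            this.kind = kind;
            this.branches = branches;
        }

        static Node of(Kind kind) {
            return new Node(kind, Collections.<Node>emptyList());
        }

        static Node union(Node... options) {
            return new Node(Kind.UNION, Arrays.asList(options));
        }
    }

    /** Lists the writes expected for one field, in order. */
    static List<String> plan(String field, Node schema, int unionIndex) {
        List<String> writes = new ArrayList<>();
        collect(field, schema, unionIndex, writes);
        return writes;
    }

    private static void collect(String field, Node schema, int unionIndex,
                                List<String> out) {
        switch (schema.kind) {
            case UNION:
                // Record which branch is in use, then recurse into that branch.
                // Avro unions may not directly contain unions, so the branch
                // is always a non-union type.
                out.add("subcolumn " + field + "_UnionIndex=" + unionIndex);
                collect(field, schema.branches.get(unionIndex), unionIndex, out);
                break;
            case RECORD:
                // The step described above as going wrong: a record belongs
                // in a super column, not a normal column.
                out.add("supercolumn " + field);
                break;
            default:
                out.add("column " + field);
        }
    }

    public static void main(String[] args) {
        Node schema = Node.union(Node.of(Kind.PRIMITIVE), Node.of(Kind.RECORD));
        System.out.println(plan("protocolStatus", schema, 1));
    }
}
```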



> Error when populating data into Cassandra super column - 
> InvalidRequestException(why:supercolumn parameter is not optional for super 
> CF sc
> --
>
> Key: GORA-416
> URL: https://issues.apache.org/jira/browse/GORA-416
> Project: Apache Gora
>  Issue Type: Bug
>  Components: gora-cassandra
>Affects Versions: 0.6
> Environment: Nutch 2.4-SNAPSHOT, Gora 0.6.1-SNAPSHOT, Hadoop 2.5.2, 
> Cassandra 2.0.7
>    Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Blocker
> Fix For: 0.6.1
>
>
> Error when populating data into Cassandra super column.
> {code}
> lmcgibbn@LMC-032857 /usr/local/2webgui/runtime/local(master) $ ./bin/nutch 
> fetch 1425410774-370456822
> FetcherJob: starting at 2015-03-03 11:27:57
> FetcherJob: batchId: 1425410774-370456822
> FetcherJob: threads: 10
> FetcherJob: parsing: false
> FetcherJob: resuming: false
> FetcherJob : timelimit set for : -1
> 2015-03-03 11:27:58.101 java[3267:1903] Unable to load realm info from 
> SCDynamicStore
> Using queue mode : byHost
> Fetcher: threads: 10
> QueueFeeder finished: total 1 records. Hit by time limit :0
> fetching http://nutch.apache.org/ (queue crawl delay=5000ms)
> -finishing thread FetcherThread1, activeThreads=1
> -finishing thread FetcherThread2, activeThreads=1
> -finishing thread FetcherThread3, activeThreads=1
> -finishing thread FetcherThread4, activeThreads=1
> -finishing thread F

[jira] [Commented] (GORA-416) Error when populating data into Cassandra super column - InvalidRequestException(why:supercolumn parameter is not optional for super CF sc

2015-03-03 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345902#comment-14345902
 ] 

Lewis John McGibbney commented on GORA-416:
---

OK so I've debugged this right through on the 
[FetcherJob|https://github.com/apache/nutch/blob/2.x/src/java/org/apache/nutch/fetcher/FetcherJob.java]
 task.
What is happening here is that we iterate through the UNION structure of the 
[protocolStatus 
field|https://github.com/apache/nutch/blob/2.x/src/gora/webpage.avsc#L58-L95] 
of the Nutch WebPage object, with the field value at position 1 being created 
as **protocolStatus_UnionIndex** and a [subColumn being created as we 
desire|https://github.com/apache/gora/blob/master/gora-cassandra/src/main/java/org/apache/gora/cassandra/store/CassandraStore.java#L590].
 
However once this has been done, when we come to the field value at position 1 
we use recursion on 
[addOrUpdateField|https://github.com/apache/gora/blob/master/gora-cassandra/src/main/java/org/apache/gora/cassandra/store/CassandraStore.java#L598]
 where we then encounter the [RECORD which is the actual 
protocolStatus|https://github.com/apache/gora/blob/master/gora-cassandra/src/main/java/org/apache/gora/cassandra/store/CassandraStore.java#L506].
 This one contains the actual value.
What happens now is that we [add this as a normal 
column|https://github.com/apache/gora/blob/master/gora-cassandra/src/main/java/org/apache/gora/cassandra/store/CassandraStore.java#L512]
 instead of the super column that it is defined as.

> Error when populating data into Cassandra super column - 
> InvalidRequestException(why:supercolumn parameter is not optional for super 
> CF sc
> --
>
> Key: GORA-416
> URL: https://issues.apache.org/jira/browse/GORA-416
> Project: Apache Gora
>  Issue Type: Bug
>  Components: gora-cassandra
>Affects Versions: 0.6
> Environment: Nutch 2.4-SNAPSHOT, Gora 0.6.1-SNAPSHOT, Hadoop 2.5.2, 
> Cassandra 2.0.7
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Blocker
> Fix For: 0.6.1
>
>
> Error when populating data into Cassandra super column.
> {code}
> lmcgibbn@LMC-032857 /usr/local/2webgui/runtime/local(master) $ ./bin/nutch 
> fetch 1425410774-370456822
> FetcherJob: starting at 2015-03-03 11:27:57
> FetcherJob: batchId: 1425410774-370456822
> FetcherJob: threads: 10
> FetcherJob: parsing: false
> FetcherJob: resuming: false
> FetcherJob : timelimit set for : -1
> 2015-03-03 11:27:58.101 java[3267:1903] Unable to load realm info from 
> SCDynamicStore
> Using queue mode : byHost
> Fetcher: threads: 10
> QueueFeeder finished: total 1 records. Hit by time limit :0
> fetching http://nutch.apache.org/ (queue crawl delay=5000ms)
> -finishing thread FetcherThread1, activeThreads=1
> -finishing thread FetcherThread2, activeThreads=1
> -finishing thread FetcherThread3, activeThreads=1
> -finishing thread FetcherThread4, activeThreads=1
> -finishing thread FetcherThread5, activeThreads=1
> -finishing thread FetcherThread6, activeThreads=1
> -finishing thread FetcherThread7, activeThreads=1
> -finishing thread FetcherThread8, activeThreads=1
> Fetcher: throughput threshold: -1
> -finishing thread FetcherThread9, activeThreads=1
> Fetcher: throughput threshold sequence: 5
> -finishing thread FetcherThread0, activeThreads=0
> 0/0 spinwaiting/active, 1 pages, 0 errors, 0.2 0 pages/s, 82 82 kb/s, 0 URLs 
> in 0 queues
> -activeThreads=0
> me.prettyprint.hector.api.exceptions.HInvalidRequestException: 
> InvalidRequestException(why:supercolumn parameter is not optional for super 
> CF sc)
>   at 
> me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:52)
>   at 
> me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:260)
>   at 
> me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:113)
>   at 
> me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:243)
>   at 
> me.prettyprint.cassandra.model.MutatorImpl.insert(MutatorImpl.java:69)
>   at 
> org.apache.gora.cassandra.store.HectorUtils.insertColumn(HectorUtils.java:46)
>   at 
> org.apache.gora.cassandra.store.CassandraClient.addColumn(CassandraClient.java:293)
>   at 
> org.apache.gora.cassandra.store.CassandraStore.addOrUpdateField(CassandraStore.java:512)
>   at 
> org.apache.gora.cassandra.store.CassandraStore.ad

[jira] [Commented] (GORA-416) Error when populating data into Cassandra super column - InvalidRequestException(why:supercolumn parameter is not optional for super CF sc

2015-03-03 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345699#comment-14345699
 ] 

Lewis John McGibbney commented on GORA-416:
---

This is happening on the [protocolStatus 
field|https://github.com/apache/nutch/blob/2.x/src/gora/webpage.avsc#L58-L95] 
of the Nutch WebPage object

> Error when populating data into Cassandra super column - 
> InvalidRequestException(why:supercolumn parameter is not optional for super 
> CF sc
> --
>
> Key: GORA-416
> URL: https://issues.apache.org/jira/browse/GORA-416
> Project: Apache Gora
>  Issue Type: Bug
>  Components: gora-cassandra
>Affects Versions: 0.6
> Environment: Nutch 2.4-SNAPSHOT, Gora 0.6.1-SNAPSHOT, Hadoop 2.5.2, 
> Cassandra 2.0.7
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Blocker
> Fix For: 0.6.1
>
>
> Error when populating data into Cassandra super column.
> {code}
> lmcgibbn@LMC-032857 /usr/local/2webgui/runtime/local(master) $ ./bin/nutch 
> fetch 1425410774-370456822
> FetcherJob: starting at 2015-03-03 11:27:57
> FetcherJob: batchId: 1425410774-370456822
> FetcherJob: threads: 10
> FetcherJob: parsing: false
> FetcherJob: resuming: false
> FetcherJob : timelimit set for : -1
> 2015-03-03 11:27:58.101 java[3267:1903] Unable to load realm info from 
> SCDynamicStore
> Using queue mode : byHost
> Fetcher: threads: 10
> QueueFeeder finished: total 1 records. Hit by time limit :0
> fetching http://nutch.apache.org/ (queue crawl delay=5000ms)
> -finishing thread FetcherThread1, activeThreads=1
> -finishing thread FetcherThread2, activeThreads=1
> -finishing thread FetcherThread3, activeThreads=1
> -finishing thread FetcherThread4, activeThreads=1
> -finishing thread FetcherThread5, activeThreads=1
> -finishing thread FetcherThread6, activeThreads=1
> -finishing thread FetcherThread7, activeThreads=1
> -finishing thread FetcherThread8, activeThreads=1
> Fetcher: throughput threshold: -1
> -finishing thread FetcherThread9, activeThreads=1
> Fetcher: throughput threshold sequence: 5
> -finishing thread FetcherThread0, activeThreads=0
> 0/0 spinwaiting/active, 1 pages, 0 errors, 0.2 0 pages/s, 82 82 kb/s, 0 URLs 
> in 0 queues
> -activeThreads=0
> me.prettyprint.hector.api.exceptions.HInvalidRequestException: 
> InvalidRequestException(why:supercolumn parameter is not optional for super 
> CF sc)
>   at 
> me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:52)
>   at 
> me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:260)
>   at 
> me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:113)
>   at 
> me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:243)
>   at 
> me.prettyprint.cassandra.model.MutatorImpl.insert(MutatorImpl.java:69)
>   at 
> org.apache.gora.cassandra.store.HectorUtils.insertColumn(HectorUtils.java:46)
>   at 
> org.apache.gora.cassandra.store.CassandraClient.addColumn(CassandraClient.java:293)
>   at 
> org.apache.gora.cassandra.store.CassandraStore.addOrUpdateField(CassandraStore.java:512)
>   at 
> org.apache.gora.cassandra.store.CassandraStore.addOrUpdateField(CassandraStore.java:598)
>   at 
> org.apache.gora.cassandra.store.CassandraStore.flush(CassandraStore.java:316)
>   at 
> org.apache.gora.cassandra.store.CassandraStore.close(CassandraStore.java:160)
>   at 
> org.apache.gora.mapreduce.GoraRecordWriter.close(GoraRecordWriter.java:56)
>   at 
> org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.close(ReduceTask.java:550)
>   at 
> org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:629)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: InvalidRequestException(why:supercolumn parameter is not opti

[jira] [Created] (GORA-416) Error when populating data into Cassandra super column - InvalidRequestException(why:supercolumn parameter is not optional for super CF sc

2015-03-03 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created GORA-416:
-

 Summary: Error when populating data into Cassandra super column - 
InvalidRequestException(why:supercolumn parameter is not optional for super CF 
sc
 Key: GORA-416
 URL: https://issues.apache.org/jira/browse/GORA-416
 Project: Apache Gora
  Issue Type: Bug
  Components: gora-cassandra
Affects Versions: 0.6
 Environment: Nutch 2.4-SNAPSHOT, Gora 0.6.1-SNAPSHOT, Hadoop 2.5.2, 
Cassandra 2.0.7
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Priority: Blocker
 Fix For: 0.6.1


Error when populating data into Cassandra super column.

{code}
lmcgibbn@LMC-032857 /usr/local/2webgui/runtime/local(master) $ ./bin/nutch 
fetch 1425410774-370456822
FetcherJob: starting at 2015-03-03 11:27:57
FetcherJob: batchId: 1425410774-370456822
FetcherJob: threads: 10
FetcherJob: parsing: false
FetcherJob: resuming: false
FetcherJob : timelimit set for : -1
2015-03-03 11:27:58.101 java[3267:1903] Unable to load realm info from 
SCDynamicStore
Using queue mode : byHost
Fetcher: threads: 10
QueueFeeder finished: total 1 records. Hit by time limit :0
fetching http://nutch.apache.org/ (queue crawl delay=5000ms)
-finishing thread FetcherThread1, activeThreads=1
-finishing thread FetcherThread2, activeThreads=1
-finishing thread FetcherThread3, activeThreads=1
-finishing thread FetcherThread4, activeThreads=1
-finishing thread FetcherThread5, activeThreads=1
-finishing thread FetcherThread6, activeThreads=1
-finishing thread FetcherThread7, activeThreads=1
-finishing thread FetcherThread8, activeThreads=1
Fetcher: throughput threshold: -1
-finishing thread FetcherThread9, activeThreads=1
Fetcher: throughput threshold sequence: 5
-finishing thread FetcherThread0, activeThreads=0
0/0 spinwaiting/active, 1 pages, 0 errors, 0.2 0 pages/s, 82 82 kb/s, 0 URLs in 
0 queues
-activeThreads=0
me.prettyprint.hector.api.exceptions.HInvalidRequestException: 
InvalidRequestException(why:supercolumn parameter is not optional for super CF 
sc)
at 
me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:52)
at 
me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:260)
at 
me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:113)
at 
me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:243)
at 
me.prettyprint.cassandra.model.MutatorImpl.insert(MutatorImpl.java:69)
at 
org.apache.gora.cassandra.store.HectorUtils.insertColumn(HectorUtils.java:46)
at 
org.apache.gora.cassandra.store.CassandraClient.addColumn(CassandraClient.java:293)
at 
org.apache.gora.cassandra.store.CassandraStore.addOrUpdateField(CassandraStore.java:512)
at 
org.apache.gora.cassandra.store.CassandraStore.addOrUpdateField(CassandraStore.java:598)
at 
org.apache.gora.cassandra.store.CassandraStore.flush(CassandraStore.java:316)
at 
org.apache.gora.cassandra.store.CassandraStore.close(CassandraStore.java:160)
at 
org.apache.gora.mapreduce.GoraRecordWriter.close(GoraRecordWriter.java:56)
at 
org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.close(ReduceTask.java:550)
at 
org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:629)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
at 
org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: InvalidRequestException(why:supercolumn parameter is not optional 
for super CF sc)
at 
org.apache.cassandra.thrift.Cassandra$batch_mutate_result$batch_mutate_resultStandardScheme.read(Cassandra.java:28082)
at 
org.apache.cassandra.thrift.Cassandra$batch_mutate_result$batch_mutate_resultStandardScheme.read(Cassandra.java:28068)
at 
org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:28002)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
at 
org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:1060)
at 
org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:1046)
at 
me.prettyprint.cassandra.model.MutatorImpl$3.execute(MutatorImpl.java:246)
at 
me.prettyprint.cassandra.model.MutatorImpl$3.execute(MutatorImpl.java:243)
at

Integrating gora-gradle-plugin into release management

2015-03-03 Thread Lewis John Mcgibbney
Hi Damien,
Can you please comment on what you see happening for initiating a release
management procedure for your Gradle plugin?
We have a release management procedure [0] however it does not currently
accommodate the Gradle plugin release.
Any help to bake this in would be grand.
Thanks
Lewis

[0]
https://cwiki.apache.org/confluence/display/GORA/Apache+Gora+Release+Procedure+HOW_TO
-- 
*Lewis*


Re: Does anyone hear about Hibernate OGM?

2015-03-03 Thread Lewis John Mcgibbney
Hi Alparslan,

On Sat, Feb 28, 2015 at 4:33 AM, 
wrote:

>
> It seems Hibernate has also an object mapper for NoSQL:
> http://hibernate.org/ogm/
>
> Has any of us tried it?
>

I knew that they had this a while back but I have not used it. What about
you?


>
> The "wide range of backends" part in the website:
>
> > Wide range of backends
> >
> > OGM talks to NoSQL backends via store-specific dialects. Currently there
> > is support for
> >
> >- *Key/Value*: Infinispan, Ehcache
> >- *Document*: MongoDB
> >- *Graph*: Neo4j
> >
> > Your favorite NoSQL store isn’t listed here? We’d love to get your help
> >  for adding it.
> >
>
>


[jira] [Updated] (GORA-330) gora-gradle-plugin

2015-03-02 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-330:
--
Fix Version/s: (was: 0.7)
   0.6.1

> gora-gradle-plugin
> --
>
> Key: GORA-330
> URL: https://issues.apache.org/jira/browse/GORA-330
> Project: Apache Gora
>  Issue Type: New Feature
>  Components: plugins
>    Reporter: Lewis John McGibbney
>Assignee: Damien Raude-Morvan
> Fix For: 0.6.1
>
>
> Assigned to Damien for karma :)
> This issue will track the contribution of the gora-gradle-plugin to our SCM.
> Thanks [~drazzib], is it possible for you to create a PR adding this as a new 
> module named *gora-gradle*?
> Thank you very much in advance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (GORA-410) Change logging behavior to pass exception object to LOG methods

2015-03-02 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-410:
--
Fix Version/s: (was: 0.7)
   0.6.1

> Change logging behavior to pass exception object to LOG methods
> ---
>
> Key: GORA-410
> URL: https://issues.apache.org/jira/browse/GORA-410
> Project: Apache Gora
>  Issue Type: Improvement
>Reporter: Gerhard Gossen
>Assignee: Gerhard Gossen
> Fix For: 0.6.1
>
> Attachments: exception_logging.patch
>
>
> Throughout the codebase, exceptions are reported by logging 
> {{e.getStackTrace().toString()}}. As {{getStackTrace}} returns an array and 
> Java's array {{.toString}} method just returns the object ID, this means that 
> the stack traces are effectively lost. Instead the exception should be passed 
> to the logger in its original form as the second argument, SLF4J does the 
> right thing with the stack trace.
> This problem was tackled in GORA-230 for AccumuloStore. The attached patch 
> corrects all instances of the problem in the current trunk.
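
The difference described in this issue can be reproduced with the JDK alone. The sketch below is illustrative and is not taken from the attached patch; the class and method names are made up. `lossy` mirrors the old pattern, while `full` approximates what a logger prints when it is handed the exception object itself.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Illustrative only: class and method names are hypothetical.
class StackTraceLoggingSketch {

    // Old pattern: toString() on the array yields its type and hash code,
    // not the frames it contains.
    static String lossy(Throwable e) {
        return e.getStackTrace().toString();
    }

    // Approximation of what a logger emits when given the Throwable.
    static String full(Throwable e) {
        StringWriter out = new StringWriter();
        PrintWriter writer = new PrintWriter(out);
        e.printStackTrace(writer);
        writer.flush();
        return out.toString();
    }

    public static void main(String[] args) {
        Exception e = new IllegalStateException("store closed");
        System.out.println(lossy(e));
        System.out.println(full(e));
        // With SLF4J the equivalent call is: LOG.error("Could not flush", e);
    }
}
```

Passing the exception as the last argument keeps the message, the frames and any cause chain in the log output.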





[jira] [Resolved] (GORA-384) Provide documentation on Gora Shims layer

2015-03-02 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney resolved GORA-384.
---
Resolution: Fixed

http://gora.apache.org/current/gora-shims.html

> Provide documentation on Gora Shims layer
> -
>
> Key: GORA-384
> URL: https://issues.apache.org/jira/browse/GORA-384
> Project: Apache Gora
>  Issue Type: Improvement
>  Components: documentation, gora-shims-distribution, 
> gora-shims-hadoop, gora-shims-hadoop1, gora-shims-hadoop2
>Affects Versions: 0.5
>    Reporter: Lewis John McGibbney
>    Assignee: Lewis John McGibbney
> Fix For: 0.6.1
>
>
> We should really back up GORA-346 with some documentation as to how it can 
> and should be used, what artifacts and what versions are available, etc.,





[jira] [Resolved] (GORA-415) hadoop-client dependency should be optional in gora-core

2015-03-02 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney resolved GORA-415.
---
Resolution: Fixed

> hadoop-client dependency should be optional in gora-core
> 
>
> Key: GORA-415
> URL: https://issues.apache.org/jira/browse/GORA-415
> Project: Apache Gora
>  Issue Type: Bug
>  Components: gora-core
>Affects Versions: 0.6
>    Reporter: Lewis John McGibbney
>Assignee: Henry Saputra
>Priority: Blocker
> Fix For: 0.6.1
>
>
> We found that the Hadoop Shims were not working as expected when attempting 
> the Gora dependency upgrade over on Nutch 2.X.
> The relevant upgrade issue is NUTCH-1946
> The relevant discussion thread is 
> [here|http://www.mail-archive.com/dev%40gora.apache.org/msg05752.html].
> We had a true dependent reliance upon hadoop-client v 2.5.2 within gora-core 
> which is not correct. The reliance should be optional meaning that shims work 
> with the true expectation that we can switch between Hadoop 1 and 2.





[jira] [Commented] (GORA-415) hadoop-client dependency should be optional in gora-core

2015-02-27 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341141#comment-14341141
 ] 

Lewis John McGibbney commented on GORA-415:
---

+1 I can get this done over the weekend.
Please look at the roadmap for 0.6.1 on the Jira tracker.
I'll also flesh out docs for the shims over the weekend as I am using them
inside of Chukwa as well and they will be used by that community.
Thanks for debugging this Henry.
Lewis




-- 
*Lewis*


> hadoop-client dependency should be optional in gora-core
> 
>
> Key: GORA-415
> URL: https://issues.apache.org/jira/browse/GORA-415
> Project: Apache Gora
>  Issue Type: Bug
>  Components: gora-core
>Affects Versions: 0.6
>Reporter: Lewis John McGibbney
>Assignee: Henry Saputra
>Priority: Blocker
> Fix For: 0.6.1
>
>
> We found that the Hadoop Shims were not working as expected when attempting 
> the Gora dependency upgrade over on Nutch 2.X.
> The relevant upgrade issue is NUTCH-1946
> The relevant discussion thread is 
> [here|http://www.mail-archive.com/dev%40gora.apache.org/msg05752.html].
> We had a true dependent reliance upon hadoop-client v 2.5.2 within gora-core 
> which is not correct. The reliance should be optional meaning that shims work 
> with the true expectation that we can switch between Hadoop 1 and 2.





[jira] [Commented] (GORA-384) Provide documentation on Gora Shims layer

2015-02-27 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340852#comment-14340852
 ] 

Lewis John McGibbney commented on GORA-384:
---

I'll hammer this one out soon folks. A good understanding of how the Shims work 
is now under my belt.


> Provide documentation on Gora Shims layer
> -
>
> Key: GORA-384
> URL: https://issues.apache.org/jira/browse/GORA-384
> Project: Apache Gora
>  Issue Type: Improvement
>  Components: documentation, gora-shims-distribution, 
> gora-shims-hadoop, gora-shims-hadoop1, gora-shims-hadoop2
>Affects Versions: 0.5
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
> Fix For: 0.6.1
>
>
> We should really back up GORA-346 with some documentation as to how it can 
> and should be used, what artifacts and what versions are available, etc.,





[jira] [Updated] (GORA-384) Provide documentation on Gora Shims layer

2015-02-27 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-384:
--
Fix Version/s: (was: 0.7)
   0.6.1

> Provide documentation on Gora Shims layer
> -
>
> Key: GORA-384
> URL: https://issues.apache.org/jira/browse/GORA-384
> Project: Apache Gora
>  Issue Type: Improvement
>  Components: documentation, gora-shims-distribution, 
> gora-shims-hadoop, gora-shims-hadoop1, gora-shims-hadoop2
>Affects Versions: 0.5
>    Reporter: Lewis John McGibbney
>Assignee: Talat UYARER
> Fix For: 0.6.1
>
>
> We should really back up GORA-346 with some documentation as to how it can 
> and should be used, what artifacts and what versions are available, etc.,





[jira] [Assigned] (GORA-384) Provide documentation on Gora Shims layer

2015-02-27 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney reassigned GORA-384:
-

Assignee: Lewis John McGibbney  (was: Talat UYARER)

> Provide documentation on Gora Shims layer
> -
>
> Key: GORA-384
> URL: https://issues.apache.org/jira/browse/GORA-384
> Project: Apache Gora
>  Issue Type: Improvement
>  Components: documentation, gora-shims-distribution, 
> gora-shims-hadoop, gora-shims-hadoop1, gora-shims-hadoop2
>Affects Versions: 0.5
>    Reporter: Lewis John McGibbney
>    Assignee: Lewis John McGibbney
> Fix For: 0.6.1
>
>
> We should really back up GORA-346 with some documentation as to how it can 
> and should be used, what artifacts and what versions are available, etc.,





[jira] [Updated] (GORA-415) hadoop-client dependency should be optional in gora-core

2015-02-27 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-415:
--
Priority: Blocker  (was: Major)

> hadoop-client dependency should be optional in gora-core
> 
>
> Key: GORA-415
> URL: https://issues.apache.org/jira/browse/GORA-415
> Project: Apache Gora
>  Issue Type: Bug
>  Components: gora-core
>Affects Versions: 0.6
>    Reporter: Lewis John McGibbney
>Assignee: Henry Saputra
>Priority: Blocker
> Fix For: 0.6.1
>
>
> We found that the Hadoop Shims were not working as expected when attempting 
> the Gora dependency upgrade over on Nutch 2.X.
> The relevant upgrade issue is NUTCH-1946
> The relevant discussion thread is 
> [here|http://www.mail-archive.com/dev%40gora.apache.org/msg05752.html].
> We had a true dependent reliance upon hadoop-client v 2.5.2 within gora-core 
> which is not correct. The reliance should be optional meaning that shims work 
> with the true expectation that we can switch between Hadoop 1 and 2.





[jira] [Updated] (GORA-415) hadoop-client dependency should be optional in gora-core

2015-02-27 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-415:
--
Description: 
We found that the Hadoop Shims were not working as expected when attempting the 
Gora dependency upgrade over on Nutch 2.X.
The relevant upgrade issue is NUTCH-1946
The relevant discussion thread is 
[here|http://www.mail-archive.com/dev%40gora.apache.org/msg05752.html].
We had a true dependent reliance upon hadoop-client v 2.5.2 within gora-core 
which is not correct. The reliance should be optional meaning that shims work 
with the true expectation that we can switch between Hadoop 1 and 2.

> hadoop-client dependency should be optional in gora-core
> 
>
> Key: GORA-415
> URL: https://issues.apache.org/jira/browse/GORA-415
> Project: Apache Gora
>  Issue Type: Bug
>  Components: gora-core
>Affects Versions: 0.6
>    Reporter: Lewis John McGibbney
>Assignee: Henry Saputra
> Fix For: 0.6.1
>
>
> We found that the Hadoop Shims were not working as expected when attempting 
> the Gora dependency upgrade over on Nutch 2.X.
> The relevant upgrade issue is NUTCH-1946
> The relevant discussion thread is 
> [here|http://www.mail-archive.com/dev%40gora.apache.org/msg05752.html].
> We had a true dependent reliance upon hadoop-client v 2.5.2 within gora-core 
> which is not correct. The reliance should be optional meaning that shims work 
> with the true expectation that we can switch between Hadoop 1 and 2.





[jira] [Updated] (GORA-415) hadoop-client dependency should be optional in gora-core

2015-02-27 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-415:
--
Fix Version/s: 0.6.1

> hadoop-client dependency should be optional in gora-core
> 
>
> Key: GORA-415
> URL: https://issues.apache.org/jira/browse/GORA-415
> Project: Apache Gora
>  Issue Type: Bug
>  Components: gora-core
>Affects Versions: 0.6
>    Reporter: Lewis John McGibbney
>Assignee: Henry Saputra
> Fix For: 0.6.1
>
>






[jira] [Created] (GORA-415) hadoop-client dependency should be optional in gora-core

2015-02-27 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created GORA-415:
-

 Summary: hadoop-client dependency should be optional in gora-core
 Key: GORA-415
 URL: https://issues.apache.org/jira/browse/GORA-415
 Project: Apache Gora
  Issue Type: Bug
  Components: gora-core
Affects Versions: 0.6
Reporter: Lewis John McGibbney
Assignee: Henry Saputra








[jira] [Created] (GORA-414) Upgrade to Accumulo 1.6.X

2015-02-27 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created GORA-414:
-

 Summary: Upgrade to Accumulo 1.6.X
 Key: GORA-414
 URL: https://issues.apache.org/jira/browse/GORA-414
 Project: Apache Gora
  Issue Type: Bug
  Components: gora-accumulo
Reporter: Lewis John McGibbney
 Fix For: 0.7


Accumulo 1.6.X is making releases now and we should have this as the current 
version for Gora.





Re: Dynamically generating HBase columns

2015-02-26 Thread Lewis John Mcgibbney
Hi Alfonso,

On Tue, Feb 24, 2015 at 10:27 PM,  wrote:

>
> In my use cases I always need a mix between static and dynamic columns.
> In my first week I tried to mix a Map over a column family overlapped with
> static columns. Didn't work because Gora was not prepared for that (and
> indeed needs thinking about it further).
>

Yeah. I've logged the following focus to deal with it
https://issues.apache.org/jira/browse/GORA-413


>
> What I do is separate the static columns in one column family (or several)
> from the dynamic stuff (that goes in a map). One Map is mapped to one
> column family in which each column:value is key=>value in the map.
> I have several maps depending on my needs, but can be just one big one with
> key=column.
>

Can you please show this graphically so I am absolutely clear on what you
are doing?


>
> What I don't fully understand is the timestamp you talk about, since we
> don't handle HBase timestamps. Do you specifically need it?
>

Yes, please read comment on GORA-413


>
> I'm not quite sure if I answer you :S
>

We will clarify it soon. Don't worry ;)


>
> Something important to ask is how many columns you will store in the column
> family?
>

Well dynamic columns will be added with every incoming chunk of data.


> Since we removed the StateManager, when you modify a map it deletes the
> column family and sends all the data again to be written (
>
> https://github.com/apache/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L289
> ),
> so adding/removing just one column can be quite killing when persisting
> several huge maps. About what volume and write pattern are we talking?
>

The volume of data will not be so large however it is concerning that
entire column families are deleted and re-written. It seems like a waste of
time and resources which we should address in an effort to make this a more
efficient process.
Thanks, let's take the discussion over to GORA-413.
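
To put a rough number on the concern raised in this thread, here is a small cost model in Java. It is a sketch under stated assumptions, not HBaseStore code: the class and method names are hypothetical, a "mutation" is counted as one delete or one put, and map values are assumed non-null. It compares the delete-and-rewrite behaviour described above with a delta that only sends changed entries.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical cost model; not part of Gora.
class MapPersistSketch {

    // Behaviour described in the thread: one family delete plus one put
    // for every entry currently in the map.
    static int fullRewriteMutations(Map<String, String> current) {
        return 1 + current.size();
    }

    // Alternative: only touch entries that were added, changed or removed.
    // This needs a record of what was last persisted.
    static int deltaMutations(Map<String, String> persisted, Map<String, String> current) {
        int mutations = 0;
        for (Map.Entry<String, String> entry : current.entrySet()) {
            if (!entry.getValue().equals(persisted.get(entry.getKey()))) {
                mutations++; // new or changed column
            }
        }
        for (String key : persisted.keySet()) {
            if (!current.containsKey(key)) {
                mutations++; // removed column
            }
        }
        return mutations;
    }

    public static void main(String[] args) {
        Map<String, String> persisted = new HashMap<>();
        for (int i = 0; i < 1000; i++) {
            persisted.put("col" + i, "value" + i);
        }
        Map<String, String> current = new HashMap<>(persisted);
        current.put("col1000", "value1000"); // a single new dynamic column

        System.out.println("full rewrite: " + fullRewriteMutations(current));
        System.out.println("delta: " + deltaMutations(persisted, current));
    }
}
```

The gap between the two counts grows with the size of the map, which is why the write pattern matters for maps that receive a new column with every incoming chunk.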


[jira] [Commented] (GORA-413) Support creation of dynamic columns within Gora datastore mapping designs

2015-02-26 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339073#comment-14339073
 ] 

Lewis John McGibbney commented on GORA-413:
---

I suggest that we start with HBase and move to other datastores based on us 
conquering this one first.

> Support creation of dynamic columns within Gora datastore mapping designs
> -
>
> Key: GORA-413
> URL: https://issues.apache.org/jira/browse/GORA-413
> Project: Apache Gora
>  Issue Type: New Feature
>  Components: gora-hbase
>Affects Versions: 0.6
>Reporter: Lewis John McGibbney
> Fix For: 0.7
>
>
> The conversation taking place on [dynamically generating HBase 
> columns|http://www.mail-archive.com/dev%40gora.apache.org/msg05754.html] has 
> raised an issue that new functionality needs to be added in order to achieve 
> this.
> The main driver for this issue coming to light is that Chukwa logs need to 
> dynamically create many many columns over time directly dependent on the 
> number of data chunks we get. Each data chunk has a [Sequence ID], this 
> sequenceID should be the column name.
> The table design will look like this
> {code}
> Row Key: [Invert Date]:[Data Type]:[Primary Key]
> Column Family: log
> Column Name: [Sequence ID]
> Timestamp: [log entry timestamp]
> Example:
> Row Key: 2132013102:TT:host1.example.com
> Column Family: log
> Column Name: 1230
> Cell Value: 2013-01-23 12:01:30 INFO This is a log entry.
> Timestamp: 1358942490
> {code}
> The inverted date allows the table to be partitioned by hour, day of the 
> month, or month more easily.
> Using the column name for the consecutive sequence ID allows fast retrieval in 
> a linear scan. This format is typically good for retrieving an hour's worth of 
> logs quickly for a node. Hence, if we batch-scan the table in a rolling 
> window via a MapReduce job at every hourly interval, we get an even spread 
> of the workload across multiple MapReduce tasks.





[jira] [Created] (GORA-413) Support creation of dynamic columns within Gora datastore mapping designs

2015-02-26 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created GORA-413:
-

 Summary: Support creation of dynamic columns within Gora datastore 
mapping designs
 Key: GORA-413
 URL: https://issues.apache.org/jira/browse/GORA-413
 Project: Apache Gora
  Issue Type: New Feature
  Components: gora-hbase
Affects Versions: 0.6
Reporter: Lewis John McGibbney
 Fix For: 0.7


The conversation taking place on [dynamically generating HBase 
columns|http://www.mail-archive.com/dev%40gora.apache.org/msg05754.html] has 
raised an issue that new functionality needs to be added in order to achieve 
this.
The main driver for this issue coming to light is that Chukwa logs need to 
dynamically create many many columns over time directly dependent on the number 
of data chunks we get. Each data chunk has a [Sequence ID], this sequenceID 
should be the column name.

The table design will look like this

{code}

Row Key: [Invert Date]:[Data Type]:[Primary Key]
Column Family: log
Column Name: [Sequence ID]
Timestamp: [log entry timestamp]

Example:

Row Key: 2132013102:TT:host1.example.com
Column Family: log
Column Name: 1230
Cell Value: 2013-01-23 12:01:30 INFO This is a log entry.
Timestamp: 1358942490
{code}

The inverted date allows the table to be partitioned by hour, day of the month, 
or month more easily.
Using the column name for the consecutive sequence ID allows fast retrieval in a 
linear scan. This format is typically good for retrieving an hour's worth of 
logs quickly for a node. Hence, if we batch scan the table in a rolling window 
via a MapReduce job at every hour interval, the workload is spread evenly 
across multiple MapReduce tasks.
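
The design above can be sketched in plain Java without any HBase dependency. This is only an illustrative in-memory model, not Gora or HBase code: the class name LogRowSketch and its methods are hypothetical, the inverted date is treated as an opaque string, and the numeric ordering of sequence IDs is an assumption (HBase itself orders column qualifiers by their byte representation).

```java
import java.util.Map;
import java.util.TreeMap;

/** Illustrative in-memory model of one row in the proposed table design. */
final class LogRowSketch {

    /** Joins the three row-key parts with ':' as in the design above. */
    static String rowKey(String invertedDate, String dataType, String primaryKey) {
        return invertedDate + ":" + dataType + ":" + primaryKey;
    }

    /** Column family "log": sequence ID (column name) -> cell value, kept sorted. */
    private final TreeMap<Long, String> logFamily = new TreeMap<>();

    /** Adds a cell whose column name is the data chunk's sequence ID. */
    void put(long sequenceId, String cellValue) {
        logFamily.put(sequenceId, cellValue);
    }

    /** Linear scan over a contiguous range of sequence IDs (both ends inclusive). */
    Map<Long, String> scan(long fromSequenceId, long toSequenceId) {
        return logFamily.subMap(fromSequenceId, true, toSequenceId, true);
    }

    public static void main(String[] args) {
        String key = rowKey("2132013102", "TT", "host1.example.com");
        LogRowSketch row = new LogRowSketch();
        // The second entry is made-up sample data, added only to show ordering.
        row.put(1231L, "2013-01-23 12:01:31 INFO Another log entry.");
        row.put(1230L, "2013-01-23 12:01:30 INFO This is a log entry.");
        System.out.println(key);
        row.scan(1230L, 1231L)
           .forEach((seq, value) -> System.out.println(seq + " -> " + value));
    }
}
```

Because new sequence IDs simply become new keys in the map, no column has to be declared ahead of time, which is the behaviour this issue asks the datastore mapping to support.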





[EARLY WARNING] Possible Major Bug in gora-cassandra

2015-02-25 Thread Lewis John Mcgibbney
Hi Folks,
Several threads have popped up over on the Nutch mailing lists regarding
use of gora-cassandra 0.5 within Nutch 2.3.

http://www.mail-archive.com/user%40nutch.apache.org/msg13228.html
http://www.mail-archive.com/user%40nutch.apache.org/msg13235.html
http://www.mail-archive.com/user%40nutch.apache.org/msg13237.html
http://www.mail-archive.com/user%40nutch.apache.org/msg13250.html

I think we can expect a 0.6.1 release pretty soon if this is discovered to
be a major bug.
I have not been using gora-cassandra for a number of months (2 or so), so I
am not immediately sure what is wrong.
We appear to be losing data between ParserJob and FetcherJob states, with 0
Map input records being provided to the ParserJob Map Reduce framework.
Any help from this team on deploying a test configuration and testing would
be highly appreciated.
Suggested software stack is as follows

Nutch 2.4-SNAPSHOT (HEAD)
Gora 0.5, Gora Cassandra 0.5
Cassandra 2.0.2

Thanks
Lewis


-- 
*Lewis*


Dynamically generating HBase columns

2015-02-24 Thread Lewis John Mcgibbney
Hi Folks,
I am currently supercharging persistence in Apache Chukwa [0] with Gora;
progress can be tracked in Jira [1].
The issue I have run into is that the required HBase schema looks as follows

Row Key: [Invert Date]:[Data Type]:[Primary Key]
Column Family: log
Column Name: [Sequence ID]
Timestamp: [log entry timestamp]

Example:

Row Key: 2132013102:TT:host1.example.com
Column Family: log
Column Name: 1230
Cell Value: 2013-01-23 12:01:30 INFO This is a log entry.
Timestamp: 1358942490

The issue here is therefore that there will be dynamically generated
columns, and the column names need to be the field 'sequenceID', which is
coming from the data bean itself.

I *think* that this conflicts with our current mapping workflow, where you
1) create the data model in JSON, 2) create the mapping file/datastore
schema, 3) compile the JSON... and so forth. The data is then mapped into the
PREDEFINED datastore specific schema.

The proposed change in workflow would keep the same steps: 1) create the data
model in JSON, 2) create the mapping file/datastore schema, 3) compile the
JSON... and so forth. The data is then mapped into the PREDEFINED datastore
specific schema AND ALSO DYNAMIC FIELDS CAN BE GENERATED ON THE FLY.
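
To make the difference between the two workflows concrete, here is a rough sketch of how a mapping could resolve a column either from a predefined entry or from a value carried by the bean. Nothing here is existing Gora API: DynamicMappingSketch, LogChunk, mapField, mapDynamic and columnFor are hypothetical names chosen for illustration, and the "family:qualifier" string form is an assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Illustrative mapping that mixes predefined columns with bean-derived ones. */
final class DynamicMappingSketch {

    /** Hypothetical data bean carrying the field that names the column. */
    static final class LogChunk {
        final long sequenceID;
        final String entry;

        LogChunk(long sequenceID, String entry) {
            this.sequenceID = sequenceID;
            this.entry = entry;
        }
    }

    /** Current workflow: field name -> fixed "family:qualifier". */
    private final Map<String, String> predefined = new HashMap<>();

    /** Proposed addition: field name -> function deriving the column from the bean. */
    private final Map<String, Function<LogChunk, String>> dynamic = new HashMap<>();

    void mapField(String field, String column) {
        predefined.put(field, column);
    }

    void mapDynamic(String field, Function<LogChunk, String> columnFromBean) {
        dynamic.put(field, columnFromBean);
    }

    /** Resolves the column for a field, preferring the predefined mapping. */
    String columnFor(String field, LogChunk bean) {
        String fixed = predefined.get(field);
        if (fixed != null) {
            return fixed;
        }
        Function<LogChunk, String> derive = dynamic.get(field);
        if (derive == null) {
            throw new IllegalArgumentException("No mapping for field: " + field);
        }
        return derive.apply(bean);
    }

    public static void main(String[] args) {
        DynamicMappingSketch mapping = new DynamicMappingSketch();
        // The column name comes from the bean's sequenceID at write time.
        mapping.mapDynamic("entry", bean -> "log:" + bean.sequenceID);
        LogChunk chunk = new LogChunk(1230L, "2013-01-23 12:01:30 INFO This is a log entry.");
        System.out.println(mapping.columnFor("entry", chunk));
    }
}
```

In this sketch the predefined entries behave exactly as they do today, and only fields registered as dynamic consult the bean, so existing mappings would not need to change.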

Has anyone else required dynamic columns for any datastore?

I think that this is very handy and I would like to see what you guys think.

Thanks

[0] http://chukwa.apache.org
[1] https://issues.apache.org/jira/browse/CHUKWA-734

-- 
*Lewis*


Workings of Hadoop Shims

2015-02-22 Thread Lewis John Mcgibbney
Hi Folks,
I'm kicking off this overdue thread to obtain a good understanding of exactly
what's going on with the Hadoop Shims. The documentation is lacking at the
moment and I am therefore putting time into rectifying this.
My humble beginnings are in progress below
http://gora.apache.org/current/gora-shims.html

Scenario - Upgrade Nutch 2.3.1-SNAPSHOT to Gora 0.6
Jira Issue - https://issues.apache.org/jira/browse/NUTCH-1946
Observations - From my initial analysis of the current state of the Shims,
here are some initial observations

   - gora-shims-distribution relies upon gora-shims-hadoop,
   gora-shims-hadoop1 and gora-shims-hadoop2
   - gora-shims-hadoop provides a parent for gora-shims-hadoop1 and
   gora-shims-hadoop2, however it also has direct dependencies upon the
   following
      - org.apache.hadoop:hadoop-client:jar:2.5.2:compile
      - org.apache.hadoop:hadoop-hdfs:jar:2.5.2:compile
      - org.apache.hadoop:hadoop-mapreduce-client-app:jar:2.5.2:compile
      - org.apache.hadoop:hadoop-yarn-api:jar:2.5.2:compile
      - org.apache.hadoop:hadoop-mapreduce-client-core:jar:2.5.2:compile
      - org.apache.hadoop:hadoop-mapreduce-client-jobclient:jar:2.5.2:compile
      - org.apache.hadoop:hadoop-annotations:jar:2.5.2:compile
   - As stated above, both gora-shims-hadoop1 and gora-shims-hadoop2 depend
   upon gora-shims-hadoop with the difference being that gora-shims-hadoop1
   then defines hadoop 1.X dependencies.

Problems - I understand that we have upgraded to Hadoop 2.5.2 by default.
This is great. What I am failing to get a grasp on, however, is exactly how
we provide guidance on upgrading to Gora 0.6 without also upgrading from
Hadoop 1.2.X --> 2.5.X.

Bear in mind that gora-core depends upon gora-shims-hadoop, so the Hadoop
2.5.2 dependencies are fetched transitively whenever we wish to upgrade the
gora-core dependency from 0.5 --> 0.6.

I am going to experiment with using a bunch of exclusions in my pom.xml
under the gora-shims-hadoop dependency, e.g. exclude all of the above Hadoop
dependencies, then explicitly add the gora-shims-hadoop1 dependency.

What makes this worse is that I cannot create profiles for this upgrade,
as I would be able to do in a Maven project, because I am working with
Ant + Ivy.

Any thoughts would be very much appreciated. Essentially, whatever we
discuss here creates the foundation for the Gora Shims documentation.

Thanks

Lewis

-- 
*Lewis*


[jira] [Commented] (GORA-412) Consider location of @SuppressWarnings("all") in compiled classes

2015-02-22 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14332410#comment-14332410
 ] 

Lewis John McGibbney commented on GORA-412:
---

+1 [~renato2099], I would also be in favor of that suggestion. 

> Consider location of @SuppressWarnings("all") in compiled classes
> -
>
> Key: GORA-412
> URL: https://issues.apache.org/jira/browse/GORA-412
> Project: Apache Gora
>  Issue Type: Improvement
>  Components: gora-compiler
>Affects Versions: 0.6
>Reporter: Lewis John McGibbney
> Fix For: 0.7
>
>
> Right now we silence any potential Javac warnings by adding the following 
> to compiled classes
> {code}
> /**
>  * Autogenerated by Avro
>  * 
>  * DO NOT EDIT DIRECTLY
>  */
> package org.apache.gora;  
> @SuppressWarnings("all")
> ...
> {code}
> This means that the Javadoc associated with the generated class is not 
> interpreted correctly by clients such as Eclipse.
> I propose to either
>  * remove the SuppressWarnings altogether, or
>  * have it generated underneath the Javadoc string.
> Any thoughts folks?





[jira] [Resolved] (GORA-410) Change logging behavior to pass exception object to LOG methods

2015-02-22 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney resolved GORA-410.
---
Resolution: Fixed

commit 65d6c7ac29d2cc041d067e5e7f0857936a7529d7
Author: Lewis John McGibbney 
Date:   Sun Feb 22 13:48:38 2015 -0800

GORA-410 Change logging behavior to pass exception object to LOG methods

Thank you very much [~gerhard.gossen], great patch.

> Change logging behavior to pass exception object to LOG methods
> ---
>
> Key: GORA-410
> URL: https://issues.apache.org/jira/browse/GORA-410
> Project: Apache Gora
>  Issue Type: Improvement
>Reporter: Gerhard Gossen
>Assignee: Gerhard Gossen
> Fix For: 0.7
>
> Attachments: exception_logging.patch
>
>
> Throughout the codebase, exceptions are reported by logging 
> {{e.getStackTrace().toString()}}. As {{getStackTrace}} returns an array and 
> Java's array {{.toString}} method just returns the type name and hash code, 
> this means that the stack traces are effectively lost. Instead the exception 
> should be passed to the logger in its original form as the second argument; 
> SLF4J does the right thing with the stack trace.
> This problem was tackled in GORA-230 for AccumuloStore. The attached patch 
> corrects all instances of the problem in the current trunk.
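
The problem described above can be reproduced with the JDK alone. The sketch below is not taken from the attached patch; the class and method names are hypothetical, and the SLF4J call appears only in a comment so that the example has no external dependencies.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

/** Shows why logging e.getStackTrace().toString() loses the stack trace. */
final class StackTraceLoggingSketch {

    /** The old pattern: the array's default toString(), not the frames. */
    static String lossy(Throwable e) {
        return e.getStackTrace().toString();
    }

    /** Renders the full trace, similar to what a logger prints for a Throwable. */
    static String full(Throwable e) {
        StringWriter buffer = new StringWriter();
        PrintWriter writer = new PrintWriter(buffer);
        e.printStackTrace(writer);
        writer.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        Exception e = new IllegalStateException("connection refused");
        // Prints something like "[Ljava.lang.StackTraceElement;@..." and nothing else.
        System.out.println(lossy(e));
        // Prints the exception class, message and every frame.
        System.out.println(full(e));
        // With SLF4J the corrected call has the shape (message text is made up):
        //   LOG.error("Could not flush the datastore", e);
    }
}
```

Passing the exception object itself lets the logging backend decide how to render the trace, which is why the fix does not need any manual string conversion.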





[jira] [Updated] (GORA-410) Change logging behavior to pass exception object to LOG methods

2015-02-22 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-410:
--
Fix Version/s: 0.7

> Change logging behavior to pass exception object to LOG methods
> ---
>
> Key: GORA-410
> URL: https://issues.apache.org/jira/browse/GORA-410
> Project: Apache Gora
>  Issue Type: Improvement
>Reporter: Gerhard Gossen
>Assignee: Gerhard Gossen
> Fix For: 0.7
>
> Attachments: exception_logging.patch
>
>
> Throughout the codebase, exceptions are reported by logging 
> {{e.getStackTrace().toString()}}. As {{getStackTrace}} returns an array and 
> Java's array {{.toString}} method just returns the type name and hash code, 
> this means that the stack traces are effectively lost. Instead the exception 
> should be passed to the logger in its original form as the second argument; 
> SLF4J does the right thing with the stack trace.
> This problem was tackled in GORA-230 for AccumuloStore. The attached patch 
> corrects all instances of the problem in the current trunk.





[jira] [Updated] (GORA-410) Change logging behavior to pass exception object to LOG methods

2015-02-22 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-410:
--
Issue Type: Improvement  (was: Bug)

> Change logging behavior to pass exception object to LOG methods
> ---
>
> Key: GORA-410
> URL: https://issues.apache.org/jira/browse/GORA-410
> Project: Apache Gora
>  Issue Type: Improvement
>Reporter: Gerhard Gossen
>Assignee: Gerhard Gossen
> Fix For: 0.7
>
> Attachments: exception_logging.patch
>
>
> Throughout the codebase, exceptions are reported by logging 
> {{e.getStackTrace().toString()}}. As {{getStackTrace}} returns an array and 
> Java's array {{.toString}} method just returns the type name and hash code, 
> this means that the stack traces are effectively lost. Instead the exception 
> should be passed to the logger in its original form as the second argument; 
> SLF4J does the right thing with the stack trace.
> This problem was tackled in GORA-230 for AccumuloStore. The attached patch 
> corrects all instances of the problem in the current trunk.





[jira] [Created] (GORA-412) Consider location of @SuppressWarnings("all") in compiled classes

2015-02-22 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created GORA-412:
-

 Summary: Consider location of @SuppressWarnings("all") in compiled 
classes
 Key: GORA-412
 URL: https://issues.apache.org/jira/browse/GORA-412
 Project: Apache Gora
  Issue Type: Improvement
  Components: gora-compiler
Affects Versions: 0.6
Reporter: Lewis John McGibbney
 Fix For: 0.7


Right now we silence any potential Javac warnings by adding the following to 
compiled classes
{code}
/**
 * Autogenerated by Avro
 * 
 * DO NOT EDIT DIRECTLY
 */
package org.apache.gora;  
@SuppressWarnings("all")
...
{code}

This means that the Javadoc associated with the generated class is not 
interpreted correctly by clients such as Eclipse.
I propose to either
 * remove the SuppressWarnings altogether, or
 * have it generated underneath the Javadoc string.

Any thoughts folks?
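
For reference, the second option might look roughly like the following. This is a hand-written sketch rather than real gora-compiler output: the class name and its members are hypothetical, and the raw List is included only so that there is something for the annotation to suppress.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical generated bean. The Javadoc block sits directly above the
 * annotated declaration, with the annotation generated underneath it.
 */
@SuppressWarnings("all")
final class GeneratedBeanSketch {

    // Raw List on purpose: the kind of code that would otherwise produce warnings.
    private final List values = new ArrayList();

    /** Appends a value to the bean's backing list. */
    void add(Object value) {
        values.add(value);
    }

    /** Returns how many values have been added. */
    int size() {
        return values.size();
    }
}
```

Placing the annotation between the Javadoc and the declaration is ordinary Java layout, so this option would keep the warnings silenced while leaving the comment adjacent to the class it documents.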




