RE: Question about RDD cache, unpersist, materialization

2014-06-10 Thread Nick Pentreath
If you want to force materialization, use .count(). Also, if you can, simply don't unpersist anything unless you really need to free the memory. — Sent from Mailbox On Wed, Jun 11, 2014 at 5:13 AM, innowireless TaeYun Kim wrote: > BTW, it is possible that rdd.first() does not compute the whole p
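
A minimal sketch of the cache-then-count pattern being suggested (the app name and input path are illustrative):

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("cache-demo"))
    val rdd = sc.textFile("hdfs:///data/events.txt").map(_.length)
    rdd.cache()   // lazy: only marks the RDD for caching
    rdd.count()   // an action: computes and materializes every partition
    // by contrast, rdd.first() may compute only the first partition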

Re: Is there anyone who can explain why the function of ALS.train give different shuffle results when execute the same transformation flatMap

2014-06-25 Thread Nick Pentreath
How many users and items do you have? Each iteration will first iterate through users and then through items, so each iteration of ALS actually ends up running two flatMap operations. I'd assume that you have many more users than items (or vice versa), which is why one of the operations generates more data.

Re: Using CQLSSTableWriter to batch load data from Spark to Cassandra.

2014-06-25 Thread Nick Pentreath
Can you not use a Cassandra OutputFormat? Seems they have BulkOutputFormat. An example of using it with Hadoop is here: http://shareitexploreit.blogspot.com/2012/03/bulkloadto-cassandra-with-hadoop.html Using it with Spark will be similar to the examples: https://github.com/apache/spark/blob/maste

Re: Using CQLSSTableWriter to batch load data from Spark to Cassandra.

2014-06-25 Thread Nick Pentreath
). > > We could not get round that issue. (Any pointers in that direction?) > > That's why I'm trying the direct CQLSSTableWriter way but it looks blocked > as well. > > -kr, Gerard. > > > > > On Wed, Jun 25, 2014 at 8:57 PM, Nick Pentreath > wrote:

Re: ElasticSearch enrich

2014-06-26 Thread Nick Pentreath
You can just add elasticsearch-hadoop as a dependency to your project to use the ESInputFormat and ESOutputFormat ( https://github.com/elasticsearch/elasticsearch-hadoop). Some other basics here: http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/current/spark.html For testing, yes I thin
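
In sbt, the dependency line would look something like this (the version shown is an assumption from around that time; check Maven Central for the current one):

    libraryDependencies += "org.elasticsearch" % "elasticsearch-hadoop" % "2.0.0"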

Re: numpy + pyspark

2014-06-27 Thread Nick Pentreath
I've not tried this - but numpy is a tricky and complex package with many dependencies on Fortran/C libraries etc. I'd say by the time you figure out how to deploy numpy correctly in this manner, you may as well have just built it into your cluster bootstrap process, or PSSH install it on each node...

Re: numpy + pyspark

2014-06-27 Thread Nick Pentreath
t; The dependencies would get tricky but I think this is the sort of situation > it's built for. > > > On 6/27/14, 11:06 AM, Avishek Saha wrote: > > I too felt the same Nick but I don't have root privileges on the cluster, > unfortunately. Are there any alternatives? > &

Re: Sample datasets for MLlib and Graphx

2014-07-03 Thread Nick Pentreath
Take a look at Kaggle competition datasets - https://www.kaggle.com/competitions For SVM there are a couple of ad click prediction datasets of pretty large size. For graph stuff, SNAP has large network data: https://snap.stanford.edu/data/ — Sent from Mailbox On Thu, Jul 3, 2014 at 3

Re: Sample datasets for MLlib and Graphx

2014-07-03 Thread Nick Pentreath
me across which are easily publicly available (very happy to be proved wrong about this though :) — Sent from Mailbox On Thu, Jul 3, 2014 at 4:39 PM, AlexanderRiggers wrote: > Nick Pentreath wrote >> Take a look at Kaggle competition datasets >> - https://www.kaggle.com/competi

Re: DynamoDB input source

2014-07-04 Thread Nick Pentreath
You should be able to use DynamoDBInputFormat (I think this should be part of AWS libraries for Java) and create a HadoopRDD from that. On Fri, Jul 4, 2014 at 8:28 AM, Ian Wilkinson wrote: > Hi, > > I noticed mention of DynamoDB as input source in > > http://ampcamp.berkeley.edu/wp-content/uplo
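
A hedged sketch of wiring such an InputFormat into an RDD, assuming a SparkContext sc; the DynamoDB class and property names below are assumptions based on the AWS EMR connector and should be verified against it:

    import org.apache.hadoop.io.Text
    import org.apache.hadoop.mapred.JobConf

    val jobConf = new JobConf(sc.hadoopConfiguration)
    jobConf.set("dynamodb.input.tableName", "myTable")  // property name is an assumption

    val rdd = sc.hadoopRDD(
      jobConf,
      classOf[org.apache.hadoop.dynamodb.read.DynamoDBInputFormat],  // class names assumed
      classOf[Text],
      classOf[org.apache.hadoop.dynamodb.DynamoDBItemWritable]
    )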

Re: matchError:null in ALS.train

2014-07-04 Thread Nick Pentreath
Do you mind posting a little more detail about what your code looks like? It appears you might be trying to reference another RDD from within your RDD in the foreach. On Fri, Jul 4, 2014 at 2:28 AM, Honey Joshi wrote: > Original Message ---
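
For context, a sketch of the anti-pattern being described; this fails at runtime because an RDD cannot be referenced inside another RDD's closure on the executors:

    val rdd1 = sc.parallelize(1 to 10)
    val rdd2 = sc.parallelize(11 to 20)
    // BROKEN: rdd2 is captured inside rdd1's foreach closure
    rdd1.foreach { x =>
      val n = rdd2.count()  // RDD operations are not supported inside other RDDs' closures
      println(x + n)
    }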

Re: DynamoDB input source

2014-07-04 Thread Nick Pentreath
I’m going to be working with python primarily. Are you aware of > comparable boto support? > > ian > >> On 4 Jul 2014, at 16:32, Nick Pentreath wrote: >> >> You should be able to use DynamoDBInputFormat (I think this should be part >> of AWS libraries for Java) a

Re: DynamoDB input source

2014-07-04 Thread Nick Pentreath
Fri, Jul 4, 2014 at 8:51 AM, Ian Wilkinson wrote: > Excellent. Let me get browsing on this. > Huge thanks, > ian > On 4 Jul 2014, at 16:47, Nick Pentreath wrote: >> No boto support for that. >> >> In master there is Python support for loading Hadoop inputFormat.

Re: DynamoDB input source

2014-07-04 Thread Nick Pentreath
che-hadoop-hive-dynamodb > . > Unsure whether this represents the latest situation… > > ian > > > On 4 Jul 2014, at 16:58, Nick Pentreath wrote: > > I should qualify by saying there is boto support for dynamodb - but not > for the inputFormat. You could roll your own python

Re: taking top k values of rdd

2014-07-05 Thread Nick Pentreath
To make it efficient in your case you may need to write a bit of custom code to emit the top k per partition and then only send those to the driver. On the driver you can then take the top k of the combined per-partition top-k lists (assuming you have (object, count) pairs for each top-k list). — Sent from Mailbox
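
A minimal sketch of that approach over (object, count) pairs; a bounded priority queue per partition would be more memory-frugal than the full sort shown here:

    import org.apache.spark.rdd.RDD

    def topK(counts: RDD[(String, Long)], k: Int): Array[(String, Long)] = {
      counts
        .mapPartitions { iter =>
          // emit only the k largest counts from this partition
          iter.toSeq.sortBy(-_._2).take(k).iterator
        }
        .collect()        // at most k * numPartitions pairs reach the driver
        .sortBy(-_._2)    // top-k the combined per-partition candidates
        .take(k)
    }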

Re: taking top k values of rdd

2014-07-05 Thread Nick Pentreath
tyQueueMonoid.build to limit the sizes > of the queues). > but this still means i am sending k items per partition to my driver, so k > x p, while i only need k. > thanks! koert > On Sat, Jul 5, 2014 at 1:21 PM, Nick Pentreath > wrote: >> To make it efficient in your case

Re: How to parallelize model fitting with different cross-validation folds?

2014-07-05 Thread Nick Pentreath
For linear models the 3rd option is by far the most efficient and I suspect what Evan is alluding to. Unfortunately it's not directly possible with the classes in MLlib now so you'll have to roll your own using the underlying SGD / BFGS primitives. — Sent from Mailbox On Sat, Jul 5, 2014 at 10:45 AM,

Re: Recommended pipeline automation tool? Oozie?

2014-07-11 Thread Nick Pentreath
You may look into the new Azkaban - which, while being quite heavyweight, is actually quite pleasant to use when set up. You can run Spark jobs (spark-submit) using Azkaban shell commands and pass parameters between jobs. It supports dependencies, simple DAGs and scheduling with retries. I'

Re: Recommended pipeline automation tool? Oozie?

2014-07-11 Thread Nick Pentreath
ot. Finally we almost rewrite > it totally. Don’t recommend it really. > > From: Nick Pentreath > Reply-To: > Date: Friday, July 11, 2014 at 3:18 PM > To: > Subject: Re: Recommended pipeline automation tool? Oozie? > > You may look into the new Azkaban - which while being quite heavyweight is

Re: import org.apache.spark.streaming.twitter._ in Shell

2014-07-15 Thread Nick Pentreath
You could try the following: create a minimal project using sbt or Maven, add spark-streaming-twitter as a dependency, run sbt assembly (or mvn package) on that to create a fat jar (with Spark as provided dependency), and add that to the shell classpath when starting up. On Tue, Jul 15, 2014 at 9
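
A sketch of such a minimal build.sbt (versions are assumptions for the 1.0.x era, and the sbt-assembly plugin must be added in project/plugins.sbt):

    name := "spark-shell-twitter"
    scalaVersion := "2.10.4"
    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core" % "1.0.1" % "provided",
      "org.apache.spark" %% "spark-streaming-twitter" % "1.0.1"
    )

Running sbt assembly then produces a fat jar under target/scala-2.10/ that can be added to the shell classpath at startup.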

Re: Count distinct with groupBy usage

2014-07-15 Thread Nick Pentreath
You can use .distinct.count on your user RDD. What are you trying to achieve with the time group by? — Sent from Mailbox On Tue, Jul 15, 2014 at 8:14 PM, buntu wrote: > Hi -- > New to Spark and trying to figure out how to generate unique counts per > page by date given this raw data: > ti
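
A minimal sketch of the suggestion, assuming a SparkContext sc:

    val userRdd = sc.parallelize(Seq("u1", "u2", "u1", "u3"))
    val uniqueUsers = userRdd.distinct().count()  // => 3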

Re: Large scale ranked recommendation

2014-07-18 Thread Nick Pentreath
It is very true that making predictions in batch for all 1 million users against the 10k items will be quite onerous in terms of computation. I have run into this issue too in making batch predictions. Some ideas: 1. Do you really need to generate recommendations for each user in batch? How are y

Re: Large scale ranked recommendation

2014-07-18 Thread Nick Pentreath
Agree GPUs may be interesting for this kind of massively parallel linear algebra on reasonable size vectors. These projects might be of interest in this regard: https://github.com/BIDData/BIDMach https://github.com/BIDData/BIDMat https://github.com/dlwh/gust Nick On Fri, Jul 18, 2014 at 7:40 P

Re: NullPointerException When Reading Avro Sequence Files

2014-07-19 Thread Nick Pentreath
I got this working locally a little while ago when playing around with AvroKeyInputFile: https://gist.github.com/MLnick/5864741781b9340cb211 But not sure about AvroSequenceFile. Any chance you have an example datafile or records? On Sat, Jul 19, 2014 at 11:00 AM, Sparky wrote: > To be more sp

Re: Spark clustered client

2014-07-22 Thread Nick Pentreath
At the moment your best bet for sharing SparkContexts across jobs will be the Ooyala job server: https://github.com/ooyala/spark-jobserver It doesn't yet support Spark 1.0, though I did manage to amend it to get it to build and run on 1.0 — Sent from Mailbox On Wed, Jul 23, 2014 at 1:21 AM, Asaf La

Re: Workarounds for accessing sequence file data via PySpark?

2014-07-23 Thread Nick Pentreath
Load from sequenceFile for PySpark is in master and save is in this PR underway (https://github.com/apache/spark/pull/1338) I hope that Kan will have it ready to merge in time for 1.1 release window (it should be, the PR just needs a final review or two). In the meantime you can check out master

Re: iScala or Scala-notebook

2014-07-29 Thread Nick Pentreath
IScala itself seems to be a bit dead unfortunately. I did come across this today: https://github.com/tribbloid/ISpark On Fri, Jul 18, 2014 at 4:59 AM, ericjohnston1989 < ericjohnston1...@gmail.com> wrote: > Hey everyone, > > I know this was asked before but I'm wondering if there have since bee

Re: zip two RDD in pyspark

2014-07-29 Thread Nick Pentreath
parallelize uses the default Serializer (PickleSerializer) while textFile uses UTF8Serializer. You can get around this with index.zip(input_data._reserialize()) (or index.zip(input_data.map(lambda x: x))) (But if you try to just do this, you run into the issue with different number of partitions

Re: Issue with Spark on EC2 using spark-ec2 script

2014-08-07 Thread Nick Pentreath
Ryan, did you come right with this? I've just run into the same problem on a new 1.0.0 cluster I spun up. The issue was that my app was not running against the Spark master, but in local mode (a default setting in my app that was a throwback from 0.9.1 and was overriding the spark defaults on the

Re: NoClassDefFoundError: org/codehaus/jackson/annotate/JsonClass with spark-submit

2014-08-07 Thread Nick Pentreath
I'm also getting this - Ryan, we both seem to be running into this issue with elasticsearch-hadoop :) I tried spark.files.userClassPathFirst true on the command line and that doesn't work. If I put that line in spark/conf/spark-defaults it works, but now I'm getting: java.lang.NoClassDefFoundError: o

Re: NoClassDefFoundError: org/codehaus/jackson/annotate/JsonClass with spark-submit

2014-08-08 Thread Nick Pentreath
By the way, for anyone using elasticsearch-hadoop, there is a fix for this here: https://github.com/elasticsearch/elasticsearch-hadoop/issues/239 Ryan - using the nightly snapshot build of 2.1.0.BUILD-SNAPSHOT fixed this for me. On Thu, Aug 7, 2014 at 3:58 PM, Nick Pentreath wrote: > I

Re: Failed running Spark ALS

2014-09-19 Thread Nick Pentreath
Have you set spark.local.dir (I think this is the config setting)? It needs to point to a volume with plenty of space. By default, if I recall, it points to /tmp. Sent from my iPhone > On 19 Sep 2014, at 23:35, "jw.cmu" wrote: > > I'm trying to run Spark ALS using the netflix dataset but failed d
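
A sketch of setting it when building the context (the mount point is illustrative):

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("als-netflix")
      .set("spark.local.dir", "/mnt/spark-scratch")  // a volume with plenty of space
    val sc = new SparkContext(conf)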

Re: spark 1.1.0 - hbase 0.98.6-hadoop2 version - py4j.protocol.Py4JJavaError java.lang.ClassNotFoundException

2014-10-04 Thread Nick Pentreath
forgot to copy user list On Sat, Oct 4, 2014 at 3:12 PM, Nick Pentreath wrote: > what version did you put in the pom.xml? > > it does seem to be in Maven central: > http://search.maven.org/#artifactdetails%7Corg.apache.hbase%7Chbase%7C0.98.6-hadoop2%7Cpom > > > org.apa

Re: word2vec: how to save an mllib model and reload it?

2014-11-07 Thread Nick Pentreath
Currently I see the word2vec model is collected onto the master, so the model itself is not distributed. I guess the question is: why do you need a distributed model? Is the vocab size so large that it's necessary? For model serving in general, unless the model is truly massive (i.e. cannot fit

Re: word2vec: how to save an mllib model and reload it?

2014-11-07 Thread Nick Pentreath
For ALS, if you want real-time recs (and usually this is on the order of 10s to a few 100s of ms response time), then Spark is not the way to go - a serving layer like Oryx or prediction.io is what you want. (At Graphflow we've built our own). You hold the factor matrices in memory and do the dot product in
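
A sketch of the in-memory scoring step being described (all names are illustrative; a real serving layer would also cache results and filter already-seen items):

    def recommend(userFactor: Array[Double],
                  itemFactors: Map[String, Array[Double]],
                  k: Int): Seq[(String, Double)] = {
      itemFactors.toSeq
        .map { case (itemId, itemFactor) =>
          // score = dot product of the user and item factor vectors
          val score = userFactor.zip(itemFactor).map { case (a, b) => a * b }.sum
          (itemId, score)
        }
        .sortBy(-_._2)
        .take(k)
    }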

Re: word2vec: how to save an mllib model and reload it?

2014-11-07 Thread Nick Pentreath
els, and then > merge the results at the end at the single master model. > On Fri, Nov 7, 2014 at 12:20 PM, Nick Pentreath > wrote: >> Currently I see the word2vec model is collected onto the master, so the >> model itself is not distributed. >> >> I guess the questi

Re: pyspark get column family and qualifier names from hbase table

2014-11-11 Thread Nick Pentreath
Feel free to add that converter as an option in the Spark examples via a PR :) — Sent from Mailbox On Wed, Nov 12, 2014 at 3:27 AM, alaa wrote: > Hey freedafeng, I'm exactly where you are. I want the output to show the > rowkey and all column qualifiers that correspond to it. How did you write

Re: RMSE in MovieLensALS increases or stays stable as iterations increase.

2014-11-26 Thread Nick Pentreath
copying user group - I keep replying directly vs reply all :) On Wed, Nov 26, 2014 at 2:03 PM, Nick Pentreath wrote: > ALS will be guaranteed to decrease the squared error (therefore RMSE) in > each iteration, on the *training* set. > > This does not hold for the *test* set / cros

Re: locality sensitive hashing for spark

2014-12-21 Thread Nick Pentreath
Looks interesting, thanks for sharing. Does it support cosine similarity? I only saw Jaccard mentioned from a quick glance. — Sent from Mailbox On Mon, Dec 22, 2014 at 4:12 AM, morr0723 wrote: > I've pushed out an implementation of locality sensitive hashing for spark. > LSH has a number of

Re: Is it possible to do incremental training using ALSModel (MLlib)?

2015-01-07 Thread Nick Pentreath
As I recall Oryx (the old version, and I assume the new one too) provide something like this: http://cloudera.github.io/oryx/apidocs/com/cloudera/oryx/als/common/OryxRecommender.html#recommendToAnonymous-java.lang.String:A-float:A-int- though Sean will be more on top of that than me :) On Mon, Ja

Re: HBase row count

2014-02-24 Thread Nick Pentreath
Yes, you're initiating a scan for each count call. The normal way to improve this would be to use cache(), which is what you have in your commented out line: // hBaseRDD.cache() If you uncomment that line, you should see an improvement overall. If caching is not an option for some reason (maybe
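
In other words, against the code from that thread:

    hBaseRDD.cache()           // uncommented: mark the RDD for caching
    val n1 = hBaseRDD.count()  // first count triggers the HBase scan and caches the rows
    val n2 = hBaseRDD.count()  // served from memory; no second scan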

Re: HBase row count

2014-02-25 Thread Nick Pentreath
t; of 'hBaseRDD.count' call. > > > > On Mon, Feb 24, 2014 at 11:29 PM, Nick Pentreath > wrote: > >> Yes, you're initiating a scan for each count call. The normal way to >> improve this would be to use cache(), which is what you have in your >> commente

Re: HBase row count

2014-02-25 Thread Nick Pentreath
ll only be doing one pass through the data anyway (like running a count every time on the full dataset) then caching is not going to help you. On Tue, Feb 25, 2014 at 4:59 PM, Soumitra Kumar wrote: > Thanks Nick. > > How do I figure out if the RDD fits in memory? > > > On Tue, Feb 2

Re: HBase row count

2014-02-26 Thread Nick Pentreath
so one job was > taking 90% of time. > BTW, is there a way to save the details available port 4040 after job is > finished? > On Tue, Feb 25, 2014 at 7:26 AM, Nick Pentreath > wrote: >> It's tricky really since you may not know upfront how much data is in >> there. Yo

Re: Rename filter() into keep(), remove() or take() ?

2014-02-27 Thread Nick Pentreath
filter comes from the Scala collection method "filter". I'd say it's best to keep in line with the Scala collections API, as Spark has done with RDDs generally (map, flatMap, take etc), so that it is easier and natural for developers to apply the same thinking for Scala (parallel) collections to Sp
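
A small illustration of that parity, assuming a SparkContext sc:

    // the same predicate reads identically on a Scala collection and an RDD
    val xs = List(1, 2, 3, 4, 5)
    xs.filter(_ % 2 == 0)             // List(2, 4)

    val rdd = sc.parallelize(xs)
    rdd.filter(_ % 2 == 0).collect()  // Array(2, 4)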

Re: Rename filter() into keep(), remove() or take() ?

2014-02-27 Thread Nick Pentreath
stand the explanation but I had to try. However, the change could be > made without breaking anything but that's another story. > Regards > Bertrand > Bertrand Dechoux > On Thu, Feb 27, 2014 at 2:05 PM, Nick Pentreath > wrote: >> filter comes from the Scala collection m

Fwd: [Scikit-learn-general] Spark+sklearn sprint outcome ?

2014-03-04 Thread Nick Pentreath
Thought that Spark users may be interested in the outcome of the Spark / scikit-learn sprint that happened last month just after Strata... -- Forwarded message -- From: Olivier Grisel Date: Fri, Feb 21, 2014 at 6:30 PM Subject: Re: [Scikit-learn-general] Spark+sklearn sprint outc

Re: Running actions in loops

2014-03-07 Thread Nick Pentreath
There is a #3, which is to use mapPartitions and initialize one Joda-Time object per partition, which is less overhead for large objects — Sent from Mailbox for iPhone On Sat, Mar 8, 2014 at 2:54 AM, Mayur Rustagi wrote: > So the whole function closure you want to apply on your RDD needs to be > serializable so
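
A sketch of that option, assuming Joda-Time on the classpath and an RDD of date strings:

    import org.joda.time.format.DateTimeFormat

    val lines = sc.parallelize(Seq("2014-03-07", "2014-03-08"))
    val parsed = lines.mapPartitions { iter =>
      // one formatter per partition, rather than per record or captured in the closure
      val fmt = DateTimeFormat.forPattern("yyyy-MM-dd")
      iter.map(fmt.parseDateTime)
    }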

Re: possible bug in Spark's ALS implementation...

2014-03-12 Thread Nick Pentreath
It would be helpful to know what parameter inputs you are using. If the regularization schemes are different (by a factor of alpha, which can often be quite high) this will mean that the same parameter settings could give very different results. A higher lambda would be required with Spark's versi

Re: Running Spark on a single machine

2014-03-16 Thread Nick Pentreath
Please follow the instructions at http://spark.apache.org/docs/latest/index.html and http://spark.apache.org/docs/latest/quick-start.html to get started on a local machine. — Sent from Mailbox for iPhone On Sun, Mar 16, 2014 at 11:39 PM, goi cto wrote: > Hi, > I know it is probably not th

Re: possible bug in Spark's ALS implementation...

2014-03-18 Thread Nick Pentreath
Great work Xiangrui, thanks for the enhancement! — Sent from Mailbox for iPhone On Wed, Mar 19, 2014 at 12:08 AM, Xiangrui Meng wrote: > Glad to hear the speed-up. Wish we can improve the implementation > further in the future. -Xiangrui > On Tue, Mar 18, 2014 at 1:55 PM, Michael Allman wrote: >>

Re: [shark-users] SQL on Spark - Shark or SparkSQL

2014-03-30 Thread Nick Pentreath
It shouldn't be too tricky to use the Spark job server to create a job where the SQL statement is an input argument, which is executed and the result returned. This gives remote server execution but no metastore layer— Sent from Mailbox for iPhone On Mon, Mar 31, 2014 at 6:56 AM, Manoj Samel wr

Re: Calling Spahk enthusiasts in Boston

2014-03-31 Thread Nick Pentreath
I would offer to host one in Cape Town but we're almost certainly the only Spark users in the country, apart from perhaps one in Johannesburg :) — Sent from Mailbox for iPhone On Mon, Mar 31, 2014 at 8:53 PM, Nicholas Chammas wrote: > My fellow Bostonians and New Englanders, > We cannot allow New

Re: possible bug in Spark's ALS implementation...

2014-04-01 Thread Nick Pentreath
Hi Michael Would you mind setting out exactly what differences you did find between the Spark and Oryx implementations? Would be good to be clear on them, and also see if there are further tricks/enhancements from the Oryx one that can be ported (such as the lambda * numRatings adjustment). N O
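
For context, the "lambda * numRatings" adjustment refers to the weighted-λ regularization of Zhou et al.'s ALS-WR paper, which scales each factor's penalty by its rating count; a sketch of the objective (notation mine):

    \min_{X,Y} \sum_{(u,i) \in \mathcal{R}} ( r_{ui} - x_u^\top y_i )^2
             + \lambda \Big( \sum_u n_u \lVert x_u \rVert^2 + \sum_i n_i \lVert y_i \rVert^2 \Big)

where n_u is the number of ratings by user u and n_i the number of ratings on item i.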

Re: possible bug in Spark's ALS implementation...

2014-04-02 Thread Nick Pentreath
4. Oryx uses the weighted regularization scheme you alluded to below, >> multiplying lambda by the number of ratings. >> >> I've patched the spark impl to support (4) but haven't pushed it to my >> clone on github. I think it would be a valuable feature to suppor

NPE using saveAsTextFile

2014-04-08 Thread Nick Pentreath
Hi I'm using Spark 0.9.0. When calling saveAsTextFile on a custom hadoop inputformat (loaded with newAPIHadoopRDD), I get the following error below. If I call count, I get the correct count of number of records, so the inputformat is being read correctly... the issue only appears when trying to

Re: NPE using saveAsTextFile

2014-04-09 Thread Nick Pentreath
4:50 PM, Nick Pentreath wrote: > Hi > > I'm using Spark 0.9.0. > > When calling saveAsTextFile on a custom hadoop inputformat (loaded with > newAPIHadoopRDD), I get the following error below. > > If I call count, I get the correct count of number of records, so the >

Re: NPE using saveAsTextFile

2014-04-09 Thread Nick Pentreath
ing there? > > Matei > > On Apr 9, 2014, at 11:38 PM, Nick Pentreath > wrote: > > Anyone have a chance to look at this? > > Am I just doing something silly somewhere? > > If it makes any difference, I am using the elasticsearch-hadoop plugin for > ESInputForm

Re: NPE using saveAsTextFile

2014-04-10 Thread Nick Pentreath
There was a closure over the config object lurking around - but in any case upgrading to 1.2.0 for config did the trick as it seems to have been a bug in Typesafe config, Thanks Matei! On Thu, Apr 10, 2014 at 8:46 AM, Nick Pentreath wrote: > Ok I thought it may be closing over the con

Re: StackOverflow Error when run ALS with 100 iterations

2014-04-16 Thread Nick Pentreath
I'd also say that running for 100 iterations is a waste of resources, as ALS will typically converge pretty quickly, as in within 10-20 iterations. On Wed, Apr 16, 2014 at 3:54 AM, Xiaoli Li wrote: > Thanks a lot for your information. It really helps me. > > > On Tue, Apr 15, 2014 at 7:57 PM, C

Re: using saveAsNewAPIHadoopFile with OrcOutputFormat

2014-04-17 Thread Nick Pentreath
ES formats are pretty easy to use.

Reading:

    val conf = new Configuration()
    conf.set("es.resource", "index/type")
    conf.set("es.query", "?q=*")
    val rdd = sc.newAPIHadoopRDD(
      conf,
      classOf[EsInputFormat[NullWritable, LinkedMapWritable]],
      classOf[NullWritable],
      classOf[LinkedMapWritable]
    )

The only g

Re: PySpark still reading only text?

2014-04-21 Thread Nick Pentreath
Also see: https://github.com/apache/spark/pull/455 This will add support for reading sequencefile and other inputformat in PySpark, as long as the Writables are either simple (primitives, maps and arrays of same), or reasonably simple Java objects. I'm about to push a change from MsgPack to

Re: User/Product Clustering with pySpark ALS

2014-04-29 Thread Nick Pentreath
There's no easy way to do this currently. The pieces are there from the PySpark code for regression, which should be adaptable. But you'd have to roll your own solution. This is something I also want, so I intend to put together a pull request for this soon — Sent from Mailbox On Tue, Apr 29,

spark-submit / S3

2014-05-16 Thread Nick Pentreath
Hi I see from the docs for 1.0.0 that the new "spark-submit" mechanism seems to support specifying the jar with hdfs:// or http:// Does this support S3? (It doesn't seem to; I have tried it on EC2 and it doesn't work): ./bin/spark-submit --master local[2] --class myclass s3n://bucket/myap

Re: Using mongo with PySpark

2014-05-19 Thread Nick Pentreath
You need to use mapPartitions (or foreachPartition) to instantiate your client in each partition, as it is not serializable by the pickle library. Something like:

    def mapper(iter):
        db = MongoClient()['spark_test_db']
        collec = db['programs']
        for val in iter:
            asc = val.encode('

Re: Python, Spark and HBase

2014-05-20 Thread Nick Pentreath
Yes, actually, if you could possibly test the patch out and see how easy it is to load HBase RDDs, that would be great. That way I could make any amendments required to make HBase / Cassandra etc easier. — Sent from Mailbox On Wed, May 21, 2014 at 4:41 AM, Matei Zaharia wrote: > Unfortunately

Re: Spark on HBase vs. Spark on HDFS

2014-05-22 Thread Nick Pentreath
Hi In my opinion, running HBase for immutable data is generally overkill in particular if you are using Shark anyway to cache and analyse the data and provide the speed. HBase is designed for random-access data patterns and high throughput R/W activities. If you are only ever writing immutable lo

Re: Writing RDDs from Python Spark progrma (pyspark) to HBase

2014-05-28 Thread Nick Pentreath
It's not possible currently to write anything other than text (or pickle files, I think in 1.0.0, or if not then in 1.0.1) from PySpark. I have an outstanding pull request to add READING any InputFormat from PySpark, and after that is in I will look into OutputFormat too. What does your data look l

Re: Python, Spark and HBase

2014-05-29 Thread Nick Pentreath
Hi Tommer, I'm working on updating and improving the PR, and will work on getting an HBase example working with it. Will feed back as soon as I have had the chance to work on this a bit more. N On Thu, May 29, 2014 at 3:27 AM, twizansk wrote: > The code which causes the error is: > > The code
