Re: wild cards in spark sql

2015-09-02 Thread Anas Sherwani
Yes, Spark SQL does support wildcards. The query you have written should work as is, provided the type of ename is string. You can find all the supported keywords and functions at http://docs.datastax.com/en/datastax_enterprise/4.6/datastax_enterprise/spark/sparkSqlSupportedSyntax.html
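To make the wildcard semantics concrete: in a SQL LIKE pattern, `%` matches any sequence of characters and `_` matches exactly one character. This is a small pure-Python illustration of those semantics (not Spark code; the column name and data are made up):

```python
import re

def sql_like_to_regex(pattern: str) -> str:
    """Translate a SQL LIKE pattern into an anchored regex:
    '%' matches any run of characters, '_' matches exactly one."""
    escaped = re.escape(pattern)
    return "^" + escaped.replace("%", ".*").replace("_", ".") + "$"

# In Spark SQL this would be e.g.:  SELECT * FROM emp WHERE ename LIKE 'J%'
names = ["John", "Jane", "Bob"]
matches = [n for n in names if re.match(sql_like_to_regex("J%"), n)]
# matches == ["John", "Jane"]
```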

Re: Spark - Eclipse IDE - Maven

2015-07-24 Thread Anas Sherwani
Can you explain the issue? Also, in which language do you want to code? There are a number of blogs on creating a simple Maven project in Eclipse, and they are pretty simple and straightforward.

Re: Spark MLlib instead of Mahout - collaborative filtering model

2015-07-21 Thread Anas Sherwani
I have never used Mahout, so I cannot compare the two. Spark MLlib, however, provides matrix-factorization-based collaborative filtering (http://spark.apache.org/docs/latest/mllib-collaborative-filtering.html) using the Alternating Least Squares (ALS) algorithm. Also, Singular Value Decomposition is handled
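The core idea behind ALS-based collaborative filtering is that each user and each item is assigned a small latent-factor vector, and a predicted rating is their dot product; ALS alternates between fixing one side and solving a least-squares problem for the other. A toy pure-Python sketch of the prediction step (the factor values here are hypothetical, not learned by MLlib):

```python
def predict(user_factors, item_factors):
    """Predicted rating = dot product of user and item latent factors."""
    return sum(u * i for u, i in zip(user_factors, item_factors))

user_vec = [0.8, 0.2]   # hypothetical learned factors for one user
item_vec = [1.0, 0.5]   # hypothetical learned factors for one item
rating = predict(user_vec, item_vec)  # 0.8*1.0 + 0.2*0.5 = 0.9
```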

Re: Random Forest Error

2015-07-15 Thread Anas Sherwani
For the RandomForest classifier, labels must be within the range [0, numClasses-1]. This means you have to map your labels to {0, 1} instead of {1, 2}.
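The remapping itself is just a shift by one. In MLlib you would apply this to the label of each LabeledPoint; here is a plain-Python sketch of the idea (the sample labels are made up):

```python
# Remap labels {1, 2} into the {0, 1} range the classifier expects.
raw_labels = [1, 2, 2, 1]
remapped = [label - 1 for label in raw_labels]  # [0, 1, 1, 0]

# Every remapped label now lies in [0, numClasses - 1] for numClasses = 2.
assert all(0 <= label <= 1 for label in remapped)
```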

Spark-CSV: Multiple delimiters and Null fields support

2015-07-06 Thread Anas Sherwani
Hi all, Apparently, we can only specify a single-character delimiter for tokenizing data using Spark-CSV. But what if we have a log file with multiple delimiters, or even a multi-character delimiter? e.g. (field1,field2:field3) with delimiters [,:], and (field1::field2::field3) with a single multi-character
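One workaround, outside of Spark-CSV itself, is to read the file as plain text and split each line manually before building rows. A sketch of the two splitting cases from the examples above, in plain Python (a regex character class for multiple single-character delimiters, an ordinary string split for a multi-character one):

```python
import re

line1 = "field1,field2:field3"    # two different single-char delimiters
line2 = "field1::field2::field3"  # one multi-character delimiter

# A character class splits on any one of the listed delimiters.
fields1 = re.split(r"[,:]", line1)

# A plain string split handles the multi-character delimiter.
fields2 = line2.split("::")
```

In Spark this logic would typically go inside a map over `sc.textFile(...)` to produce an RDD of field lists.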