Re: Spark ml.ALS question -- RegressionEvaluator .evaluate giving ~1.5 output for same train and predict data

2016-07-24 Thread VG
Ping. Does anyone have suggestions or advice for me? It would be really helpful. VG On Sun, Jul 24, 2016 at 12:19 AM, VG <vlin...@gmail.com> wrote: > Sean, > > I did this just to test the model. When I do a split of my data as > training to 80% and test to be 20% > > I get

Re: Spark ml.ALS question -- RegressionEvaluator .evaluate giving ~1.5 output for same train and predict data

2016-07-23 Thread VG
Any suggestions / ideas here ? On Sun, Jul 24, 2016 at 12:19 AM, VG <vlin...@gmail.com> wrote: > Sean, > > I did this just to test the model. When I do a split of my data as > training to 80% and test to be 20% > > I get a Root-mean-square error = NaN > > S

Re: Spark ml.ALS question -- RegressionEvaluator .evaluate giving ~1.5 output for same train and predict data

2016-07-23 Thread VG
Sean, I did this just to test the model. When I do a split of my data as training to 80% and test to be 20% I get a Root-mean-square error = NaN So I am wondering where I might be going wrong Regards, VG On Sun, Jul 24, 2016 at 12:12 AM, Sean Owen <so...@cloudera.com> wrote: > N
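A possible explanation for the NaN, not confirmed in the visible part of this thread: in Spark releases of this period, ALSModel.transform produces NaN for any user or item in the test split that was absent from the training split, and a single NaN prediction makes the whole RMSE NaN. The plain-Java sketch below reproduces only the arithmetic, with no Spark dependency. The class and method names are hypothetical and chosen for illustration.

```java
final class RmseNanSketch {

    /** Plain RMSE: a single NaN prediction propagates and makes the result NaN. */
    static double rmse(double[] labels, double[] predictions) {
        double sumSq = 0.0;
        for (int i = 0; i < labels.length; i++) {
            double diff = labels[i] - predictions[i];
            sumSq += diff * diff;
        }
        return Math.sqrt(sumSq / labels.length);
    }

    /** RMSE over the rows whose prediction is not NaN; NaN if no such row exists. */
    static double rmseIgnoringNaN(double[] labels, double[] predictions) {
        double sumSq = 0.0;
        int used = 0;
        for (int i = 0; i < labels.length; i++) {
            if (Double.isNaN(predictions[i])) {
                continue; // skip rows the model could not score
            }
            double diff = labels[i] - predictions[i];
            sumSq += diff * diff;
            used++;
        }
        return used == 0 ? Double.NaN : Math.sqrt(sumSq / used);
    }

    public static void main(String[] args) {
        double[] labels = {4.0, 3.0, 5.0};
        // The middle row stands in for a user or item unseen during training.
        double[] predictions = {3.5, Double.NaN, 4.5};
        System.out.println(rmse(labels, predictions));
        System.out.println(rmseIgnoringNaN(labels, predictions));
    }
}
```

If this is the cause, the equivalent step in the Spark job would be to drop the NaN rows from the predictions (for example with predictions.na().drop()) before passing them to RegressionEvaluator. That is an assumption about the poster's pipeline, which the snippet does not show.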

Re: Error in collecting RDD as a Map - IOException in collectAsMap

2016-07-23 Thread VG
Hi Pedro, Based on your suggestion, I deployed this on an AWS node and it worked fine. Thanks for your advice. I am still trying to figure out the issues in the local environment. Anyway, thanks again -VG On Sat, Jul 23, 2016 at 9:26 PM, Pedro Rodriguez <ski.rodrig...@gmail.com> wrote:

Spark ml.ALS question -- RegressionEvaluator .evaluate giving ~1.5 output for same train and predict data

2016-07-23 Thread VG
I am trying to run ml.ALS to compute some recommendations. Just to test I am using the same dataset for training using ALSModel and for predicting the results based on the model . When I evaluate the result using RegressionEvaluator I get a Root-mean-square error = 1.5544064263236066 I thin

Re: Error in collecting RDD as a Map - IOException in collectAsMap

2016-07-23 Thread VG
> — > Pedro Rodriguez > PhD Student in Large-Scale Machine Learning | CU Boulder > Systems Oriented Data Scientist > UC Berkeley AMPLab Alumni > > pedrorodriguez.io | 909-353-4423 > github.com/EntilZha | LinkedIn > <https://www.linkedin.com/in/pedrorodriguezscience>

Error in collecting RDD as a Map - IOException in collectAsMap

2016-07-23 Thread VG
Please suggest if I am doing something wrong or an alternative way of doing this. I have an RDD with two values as follows JavaPairRDD rdd When I execute rdd.collectAsMap() it always fails with IO exceptions. 16/07/23 19:03:58 ERROR RetryingBlockFetcher: Exception while

How to search on a Dataset / RDD <Row, Long >

2016-07-22 Thread VG
Any suggestions here please I basically need an ability to look up *name -> index* and *index -> name* in the code -VG On Fri, Jul 22, 2016 at 6:40 PM, VG <vlin...@gmail.com> wrote: > Hi All, > > I am really confused how to proceed further. Please help. > > I have a
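One way to get both lookups, sketched under the assumption that the distinct names are few enough to collect to the driver: keep a map for name -> index and a list for index -> name, assigning consecutive indices from 0 the way zipWithIndex does. The class below is plain Java with no Spark dependency. Its name and methods are hypothetical, not part of any Spark API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class NameIndexSketch {
    private final Map<String, Long> nameToIndex = new HashMap<>();
    private final List<String> indexToName = new ArrayList<>();

    /** Registers a name if it is new and returns its index (consecutive, starting at 0). */
    long add(String name) {
        Long existing = nameToIndex.get(name);
        if (existing != null) {
            return existing;
        }
        long index = indexToName.size();
        nameToIndex.put(name, index);
        indexToName.add(name);
        return index;
    }

    /** name -> index lookup; returns -1 when the name is unknown. */
    long indexOf(String name) {
        Long index = nameToIndex.get(name);
        return index == null ? -1L : index;
    }

    /** index -> name lookup; returns null when the index is out of range. */
    String nameOf(long index) {
        if (index < 0 || index >= indexToName.size()) {
            return null;
        }
        return indexToName.get((int) index);
    }

    public static void main(String[] args) {
        NameIndexSketch lookup = new NameIndexSketch();
        for (String name : new String[] {"alpha", "beta", "alpha", "gamma"}) {
            lookup.add(name);
        }
        System.out.println(lookup.indexOf("gamma"));
        System.out.println(lookup.nameOf(1));
    }
}
```

In a Spark job the names would come from collecting the distinct name column on the driver. For data too large to collect, a join against the indexed Dataset would be the alternative, which this sketch does not cover.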

Re: Error in running JavaALSExample example from spark examples

2016-07-22 Thread VG
Great. thanks a ton for helping out on this Sean. I somehow messed this up (and was running in loops for last 2 hours ) thanks again -VG On Fri, Jul 22, 2016 at 11:28 PM, Sean Owen <so...@cloudera.com> wrote: > You mark these provided, which is correct. If the version of Scala &

Re: Error in running JavaALSExample example from spark examples

2016-07-22 Thread VG
at > runtime then. Sounds like it was built for Scala 2.10 > > On Fri, Jul 22, 2016 at 6:43 PM, VG <vlin...@gmail.com> wrote: > > Using 2.0.0-preview using maven > > So all dependencies should be correct I guess > > > > > > org.apache.spark > > sp

Re: Error in running JavaALSExample example from spark examples

2016-07-22 Thread VG
Using 2.0.0-preview using maven So all dependencies should be correct I guess org.apache.spark spark-core_2.11 2.0.0-preview provided I see in maven dependencies that this brings in scala-reflect-2.11.4 scala-compiler-2.11.0 and so on On Fri, Jul 22, 2016 at 11:04 PM, Aaron Ilovici

Error in running JavaALSExample example from spark examples

2016-07-22 Thread VG
to resolve this VG

Re: Dataset , RDD zipWithIndex -- How to use as a map .

2016-07-22 Thread VG
Hi All, Any suggestions for this Regards, VG On Fri, Jul 22, 2016 at 6:40 PM, VG <vlin...@gmail.com> wrote: > Hi All, > > I am really confused how to proceed further. Please help. > > I have a dataset created as follows: > Dataset b = sqlContext.sql("SELECT bid, na

Re: ml ALS.fit(..) issue

2016-07-22 Thread VG
Can someone please help here. I tried both scala 2.10 and 2.11 on the system On Fri, Jul 22, 2016 at 7:59 PM, VG <vlin...@gmail.com> wrote: > I am using version 2.0.0-preview > > > > On Fri, Jul 22, 2016 at 7:47 PM, VG <vlin...@gmail.com> wrote: > >> I am r

Re: ml ALS.fit(..) issue

2016-07-22 Thread VG
I am using version 2.0.0-preview On Fri, Jul 22, 2016 at 7:47 PM, VG <vlin...@gmail.com> wrote: > I am running into the following error when running ALS > > Exception in thread "main" java.lang.NoSuchMethodError: > scala.reflect.api.JavaUniverse.runtimeMirror(Lj

ml ALS.fit(..) issue

2016-07-22 Thread VG
.scala:452) at yelp.TestUser.main(TestUser.java:101) Line 101 in the above error is the following line of code: ALSModel model = als.fit(training); Does anyone have a suggestion as to what is going on here and where I might be going wrong? Please suggest -VG

Dataset , RDD zipWithIndex -- How to use as a map .

2016-07-22 Thread VG
o do this please suggest that. Regards VG

Re: MLlib, Java, and DataFrame

2016-07-22 Thread VG
Interesting. Thanks for this information. On Fri, Jul 22, 2016 at 11:26 AM, Bryan Cutler <cutl...@gmail.com> wrote: > ML has a DataFrame based API, while MLlib is RDDs and will be deprecated > as of Spark 2.0. > > On Thu, Jul 21, 2016 at 10:41 PM, VG <vlin...@gmail.com>

Re: MLlib, Java, and DataFrame

2016-07-21 Thread VG
Why do we have these 2 packages ... ml and mllib? What is the difference between them? On Fri, Jul 22, 2016 at 11:09 AM, Bryan Cutler wrote: > Hi JG, > > If you didn't know this, Spark MLlib has 2 APIs, one of which uses > DataFrames. Take a look at this example >

Re: spark-xml - xml parsing when rows only have attributes

2016-06-17 Thread VG
Great.. thanks for pointing this out. On Fri, Jun 17, 2016 at 6:21 PM, Ted Yu <yuzhih...@gmail.com> wrote: > Please see https://github.com/databricks/spark-xml/issues/92 > > On Fri, Jun 17, 2016 at 5:19 AM, VG <vlin...@gmail.com> wrote: > >> I am using spark-xml

spark-xml - xml parsing when rows only have attributes

2016-06-17 Thread VG
the data Any suggestions to fix this On Fri, Jun 17, 2016 at 4:28 PM, Siva A <siva9940261...@gmail.com> wrote: > Use Spark XML version,0.3.3 > > com.databricks > spark-xml_2.10 > 0.3.3 > > > On Fri, Jun 17, 2016 at 4:25 PM, VG <vlin...@gmail.com> wrote: >

Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

2016-06-17 Thread VG
It proceeded with the jars I mentioned. However no data getting loaded into data frame... sob sob :( On Fri, Jun 17, 2016 at 4:25 PM, VG <vlin...@gmail.com> wrote: > Hi Siva > > This is what i have for jars. Did you manage to run with these or > different versions ? > >

Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

2016-06-17 Thread VG
Hi Siva This is what i have for jars. Did you manage to run with these or different versions ? org.apache.spark spark-core_2.10 1.6.1 org.apache.spark spark-sql_2.10 1.6.1 com.databricks spark-xml_2.10 0.2.0 org.scala-lang scala-library 2.10.6 Thanks VG On Fri, Jun 17, 2016 at 4:16

Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

2016-06-17 Thread VG
jar and run using spark-submit >> >> Siva >> >> On Fri, Jun 17, 2016 at 3:17 PM, VG <vlin...@gmail.com> wrote: >> >>> I am trying to run from IDE and everything else is working fine. >>> I added spark-xml jar and now I ended up into this dependency >

Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

2016-06-17 Thread VG
wrote: > So you are using spark-submit or spark-shell? > > you will need to launch either by passing --packages option (like in the > example below for spark-csv). you will need to know > > --packages com.databricks:spark-xml_: > > hth > > > > On Fri, Ju

Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

2016-06-17 Thread VG
"xml") > .option("rowTag", "row") > .load("A.xml"); > > FYR: https://github.com/databricks/spark-xml > > --Siva > > On Fri, Jun 17, 2016 at 2:50 PM, VG <vlin...@gmail.com> wrote: > >> Apologies for th

Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

2016-06-17 Thread VG
mistr...@gmail.com> wrote: > too little info > it'll help if you can post the exception and show your sbt file (if you > are using sbt), and provide minimal details on what you are doing > kr > > On Fri, Jun 17, 2016 at 10:08 AM, VG <vlin...@gmail.com> wrote: > >> Failed to find data source: com.databricks.spark.xml >> >> Any suggestions to resolve this >> >> >> >

java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

2016-06-17 Thread VG
Failed to find data source: com.databricks.spark.xml Any suggestions to resolve this

Re: ERROR RetryingBlockFetcher: Exception while beginning fetch of 1 outstanding blocks

2016-06-15 Thread VG
Any suggestions on this please On Wed, Jun 15, 2016 at 10:42 PM, VG <vlin...@gmail.com> wrote: > I have a very simple driver which loads a textFile and filters a >> sub-string from each line in the textfile. >> When the collect action is executed , I am getting an exce

Fwd: ERROR RetryingBlockFetcher: Exception while beginning fetch of 1 outstanding blocks

2016-06-15 Thread VG
> > I have a very simple driver which loads a textFile and filters a > sub-string from each line in the textfile. > When the collect action is executed , I am getting an exception. (The > file is only 90 MB - so I am confused what is going on..) I am running on a > local standalone cluster > >