Mapping words to vectors with Spark ML CountVectorizerModel

2017-12-18 Thread Sandeep Nemuri
…],[0.6095235999680518,0.9946971867717818,0.5151611294911758,0.4371112749198506,3.4968901993588046,0.06806241719930584,1.1156025996012633,3.0425756717399217,0.3760235829400124]) I want to get the top-n words that map to this ranking. Any pointers on how to achieve this?
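One way to do this is to pair the model's vocabulary with the vector's values and sort by weight, since CountVectorizerModel keeps its vocabulary index-aligned with the feature vector. A minimal Scala sketch, assuming a fitted model cvModel and a weight vector weights (both hypothetical names):

import org.apache.spark.ml.feature.CountVectorizerModel
import org.apache.spark.ml.linalg.Vector

// cvModel.vocabulary(i) is the word behind index i of the feature vector,
// so zipping vocabulary with the vector's values and sorting by weight
// gives the top-n ranked words.
def topNWords(cvModel: CountVectorizerModel, weights: Vector, n: Int): Seq[(String, Double)] =
  cvModel.vocabulary
    .zip(weights.toArray)          // index-aligned (word, weight) pairs
    .sortBy { case (_, w) => -w }  // highest weight first
    .take(n)
    .toSeq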

Re: SparkSession via HS2 - Error - spark.yarn.jars not read

2017-07-05 Thread Sandeep Nemuri
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:841)
at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:841)
at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:133)
at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:170)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:555)
... 18 more
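On the spark.yarn.jars part, a sketch of one way to set the property explicitly when building the SparkSession, assuming the Spark jars were pre-staged to an HDFS directory (/spark-jars is a hypothetical path); the equivalent spark.yarn.jars entry can also go in spark-defaults.conf:

import org.apache.spark.sql.SparkSession

// point YARN at a pre-staged copy of the Spark jars on HDFS so
// executors do not depend on a local Spark install being present
val spark = SparkSession.builder()
  .appName("hs2-session")
  .config("spark.yarn.jars", "hdfs:///spark-jars/*.jar")
  .getOrCreate()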

Re: What is the easiest way for an application to Query parquet data on HDFS?

2017-06-04 Thread Sandeep Nemuri
…and currently only supports Scala 2.10 builds of Spark. To run Livy with local sessions, first export these variables:" I am using Spark 2.1.1 and Scala 2.11.8, and I would like to use the DataFrame and Dataset APIs, so it sounds like this is not an option for me? …

Re: What is the easiest way for an application to Query parquet data on HDFS?

2017-06-04 Thread Sandeep Nemuri
… Any suggestions? Thanks!

Re: spark-submit config via file

2017-03-27 Thread Sandeep Nemuri
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:745)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
17/03/24 11:36:27 INFO MetricsSystemImpl: Stopping azure-file-system metrics system...

Does anyone know if this is even possible?

Thanks...
Roy
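It is possible: spark-submit accepts a properties file via its --properties-file flag, which is read in place of conf/spark-defaults.conf for that submission. A sketch, with hypothetical paths and class name:

spark-submit \
  --properties-file /path/to/my-spark.conf \
  --class com.example.Main \
  my-app.jar

where my-spark.conf uses the same whitespace-separated key/value format as spark-defaults.conf, for example:

spark.master yarn
spark.executor.memory 4g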

Re: Spark Streaming Job Keeps growing memory over time

2016-08-09 Thread Sandeep Nemuri
…unable to create multiple executors within a worker, and also in the spark-env.sh file, setting any configuration related to executors applies under YARN mode only. I have also tried running the example program but hit the same problem. Any help would be greatly appreciated. Thanks

Re: Stop Spark Streaming Jobs

2016-08-04 Thread Sandeep Nemuri
Also set spark.streaming.stopGracefullyOnShutdown to true: if true, Spark shuts down the StreamingContext gracefully on JVM shutdown rather than immediately. See http://spark.apache.org/docs/latest/configuration.html#spark-streaming On Thu, Aug 4, 2016 at 12:31 PM, Sandeep Nemuri …
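A minimal Scala sketch of wiring this up (the app name and batch interval are arbitrary choices):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// ask Spark to stop the StreamingContext gracefully (letting in-flight
// batches finish) when the JVM receives a shutdown signal
val conf = new SparkConf()
  .setAppName("graceful-shutdown-demo")
  .set("spark.streaming.stopGracefullyOnShutdown", "true")

val ssc = new StreamingContext(conf, Seconds(10))
// ... define the streaming computation on ssc here ...
ssc.start()
ssc.awaitTermination()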

Re: Stop Spark Streaming Jobs

2016-08-04 Thread Sandeep Nemuri
…and pushed to the background with nohup. What are the recommended ways to stop the job in either yarn-client or cluster mode? Thanks, Pradeep
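Besides the shutdown-hook setting above, a StreamingContext can also be stopped explicitly from the driver; a sketch, where ssc stands for the running context:

import org.apache.spark.streaming.StreamingContext

// stop the streaming job gracefully, letting received data finish
// processing, and tear down the underlying SparkContext as well
def shutdown(ssc: StreamingContext): Unit =
  ssc.stop(stopSparkContext = true, stopGracefully = true)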

Re: data frame or RDD for machine learning

2016-06-09 Thread Sandeep Nemuri
Please refer to http://spark.apache.org/docs/latest/mllib-guide.html ~Sandeep On Thursday 9 June 2016, Jacek Laskowski wrote: "Hi, use the DataFrame-based API (aka spark.ml) first, and if your ML algorithm doesn't support it, switch to the RDD-based API (spark.mllib). What…"
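To make the DataFrame-based recommendation concrete, a minimal spark.ml sketch (the tiny dataset is made up for illustration):

import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("spark-ml-demo").getOrCreate()
import spark.implicits._

// a tiny labeled DataFrame with the "label"/"features" columns
// that spark.ml estimators expect by default
val training = Seq(
  (1.0, Vectors.dense(0.0, 1.1, 0.1)),
  (0.0, Vectors.dense(2.0, 1.0, -1.0)),
  (1.0, Vectors.dense(0.0, 1.2, -0.5))
).toDF("label", "features")

// fit a model directly on the DataFrame; no RDD code needed
val model = new LogisticRegression().setMaxIter(10).fit(training)
println(s"Coefficients: ${model.coefficients}")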

Re: yarn-cluster mode error

2016-05-17 Thread Sandeep Nemuri
Can anyone suggest why I am getting this error message? Thanks, Raj

Re: parquet table in spark-sql

2016-05-03 Thread Sandeep Nemuri
> > "Experience is what you get when you didn't get what you wanted" >-By Prof. Randy Pausch in "The Last Lecture" > > My Journal :- http://varadharajan.in > -- * Regards* * Sandeep Nemuri*