Re: How to run Zeppelin and Spark Thrift Server Together

2016-07-17 Thread Chanh Le
Hi Ayan, I succeeded with Tableau, but I still can’t import metadata from Hive into Oracle BI. It seems Oracle BI still can’t connect to STS. Regards, Chanh > On Jul 15, 2016, at 11:44 AM, ayan guha wrote: > > It’s possible that the transfer protocols are not matching, that’s

Re: Filtering RDD Using Spark.mllib's ChiSqSelector

2016-07-17 Thread Yanbo Liang
Hi Tobi, Thanks for clarifying the question. It's very straightforward to convert the filtered RDD to a DataFrame; you can refer to the following code snippet: from pyspark.sql import Row rdd2 = filteredRDD.map(lambda v: Row(features=v)) df = rdd2.toDF() Thanks Yanbo 2016-07-16 14:51 GMT-07:00
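
For anyone doing the same in Scala, a minimal equivalent sketch, assuming filteredRDD is an RDD[Vector] coming out of ChiSqSelector and sqlContext is an existing SQLContext:

    import org.apache.spark.mllib.linalg.Vector
    import org.apache.spark.rdd.RDD

    // filteredRDD: RDD[Vector] produced by ChiSqSelector.transform
    import sqlContext.implicits._
    val df = filteredRDD.map(Tuple1.apply).toDF("features")
    df.printSchema()  // root |-- features: vector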

Re: scala.MatchError on stand-alone cluster mode

2016-07-17 Thread Mekal Zheng
Hi Rishabh Bhardwaj, Saisai Shao, thanks for your help. I have found that the key reason is that I forgot to upload the jar package to all of the nodes in the cluster, so after the master distributed the job and selected one node as the driver, the driver could not find the jar package and threw an exception.
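
For reference, a common alternative to copying the jar to every node by hand is to let Spark ship it with the job; a minimal sketch, with the jar path as a placeholder:

    import org.apache.spark.{SparkConf, SparkContext}

    // Ship the application jar with the job so the driver and executors can load its classes.
    // The same effect can be had with `spark-submit --jars /path/to/app.jar`.
    val conf = new SparkConf()
      .setAppName("my-app")
      .setJars(Seq("/path/to/app.jar"))  // placeholder path
    val sc = new SparkContext(conf)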

Re: Spark streaming takes longer time to read json into dataframes

2016-07-17 Thread Diwakar Dhanuskodi
Hi, Repartitioning would create a shuffle over the network, which I should avoid to reduce processing time, because the total size of messages in a batch can be up to 5G. Partitioning the topic and parallelizing receiving in the Direct Stream might do the trick. Sent from Samsung Mobile.
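
A minimal sketch of the direct-stream approach described above, assuming the spark-streaming-kafka 0.8 connector, a broker at broker1:9092, and a topic named "events" (all placeholder names); each Kafka partition becomes an RDD partition, so adding topic partitions raises read parallelism without a repartition shuffle:

    import kafka.serializer.StringDecoder
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils

    val conf = new SparkConf().setAppName("direct-stream-example")
    val ssc = new StreamingContext(conf, Seconds(10))

    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")
    val topics = Set("events")  // one RDD partition per Kafka topic partition

    // No receiver and no repartition shuffle: each Kafka partition is read in parallel.
    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topics)

    stream.map(_._2).foreachRDD { rdd =>
      // parse the JSON payloads here, e.g. with sqlContext.read.json(rdd)
    }

    ssc.start()
    ssc.awaitTermination()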

Re: Dataframe Transformation with Inner fields in Complex Datatypes.

2016-07-17 Thread ayan guha
Hi, withColumn adds a column. If you want a different name, please use the .alias() function. On Mon, Jul 18, 2016 at 2:16 AM, java bigdata wrote: > Hi Team, > > I am facing a major issue while transforming a dataframe containing complex > datatype columns. I need to update the
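
A small sketch of the distinction being made, assuming a DataFrame df with a string column name (placeholder names):

    import org.apache.spark.sql.functions.{col, upper}

    // withColumn adds (or replaces) a column on the DataFrame under the given name...
    val added = df.withColumn("name_upper", upper(col("name")))

    // ...whereas alias() renames an expression inside a select.
    val renamed = df.select(upper(col("name")).alias("name_upper"))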

Re: How to recommend most similar users using Spark ML

2016-07-17 Thread Karl Higley
There are also some Spark packages for finding approximate nearest neighbors using locality sensitive hashing: https://spark-packages.org/?q=tags%3Alsh On Fri, Jul 15, 2016 at 7:45 AM nguyen duc Tuan wrote: > Hi jeremycod, > If you want to find top N nearest neighbors for
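
Those packages each have their own APIs; purely as an illustration of what an LSH-based neighbour lookup looks like, here is a sketch using the BucketedRandomProjectionLSH estimator available in newer Spark ML releases (2.1+), assuming usersDf has an id column and a features vector column (placeholder names):

    import org.apache.spark.ml.feature.BucketedRandomProjectionLSH
    import org.apache.spark.ml.linalg.Vectors

    val lsh = new BucketedRandomProjectionLSH()
      .setInputCol("features")
      .setOutputCol("hashes")
      .setBucketLength(2.0)
      .setNumHashTables(3)

    val model = lsh.fit(usersDf)

    // Approximate top-10 most similar users to a query vector.
    val key = Vectors.dense(0.1, 0.3, 0.5)
    val neighbors = model.approxNearestNeighbors(usersDf, key, 10)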

How to use Spark scala custom UDF in spark sql CLI or beeline client

2016-07-17 Thread pooja mehta
Hi, how can I use a Spark Scala custom UDF in the spark-sql CLI or Beeline client? With sqlContext we can register a UDF like this: sqlContext.udf.register("sample_fn", sample_fn _) What is the way to use the UDF in the Spark SQL CLI or Beeline client? Thanks Pooja
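
One way this is commonly done (not confirmed in this thread) is to package the function as a Hive UDF and register it from the CLI or Beeline with CREATE TEMPORARY FUNCTION; a sketch, with class name, jar path, and table name as placeholders:

    package com.example.udf

    import org.apache.hadoop.hive.ql.exec.UDF

    // A plain Hive UDF: spark-sql and beeline resolve functions through the Hive registry,
    // so this class (built into a jar with hive-exec on the compile classpath) can be used there.
    class SampleFn extends UDF {
      def evaluate(s: String): String = if (s == null) null else s.toUpperCase
    }

    // Then, inside the spark-sql CLI or beeline:
    //   ADD JAR /path/to/sample-udf.jar;
    //   CREATE TEMPORARY FUNCTION sample_fn AS 'com.example.udf.SampleFn';
    //   SELECT sample_fn(name) FROM people;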

Dataframe Transformation with Inner fields in Complex Datatypes.

2016-07-17 Thread java bigdata
Hi Team, I am facing a major issue while transforming a dataframe containing complex datatype columns. I need to update the inner fields of a complex datatype, e.g. converting one inner field to UPPERCASE letters, and return the same dataframe with the new uppercase values in it. Below is my issue
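
A minimal sketch of the usual pattern for this, assuming df has a struct column address with inner fields city and zip (placeholder names); struct fields can't be updated in place, so the struct is rebuilt with the transformed field and written back over the original column:

    import org.apache.spark.sql.functions.{col, struct, upper}

    // Rebuild the struct with `city` upper-cased, keeping the other inner field unchanged,
    // then overwrite the original column so the rest of the schema stays the same.
    val transformed = df.withColumn(
      "address",
      struct(
        upper(col("address.city")).alias("city"),
        col("address.zip").alias("zip")
      )
    )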

unsubscribe

2016-07-17 Thread Burger, Robert
Robert Burger | Solutions Design IT Specialist | CBAW TS | TD Wealth Technology Solutions 79 Wellington Street West, 17th Floor, TD South Tower, Toronto, ON, M5K 1A2 If you wish to unsubscribe from receiving commercial electronic messages from TD Bank Group, please click here or go to the

Re: Spark (on Windows) not picking up HADOOP_CONF_DIR

2016-07-17 Thread Jacek Laskowski
Hi, How did you set it? How do you run the app? Use sys.env to know whether it was set or not. Jacek On 17 Jul 2016 11:33 a.m., "Daniel Haviv" wrote: > Hi, > I'm running Spark using IntelliJ on Windows and even though I set > HADOOP_CONF_DIR it does not affect
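
A quick sketch of that check, plus one way to work around it when launching from an IDE; the paths are placeholders and sc is an existing SparkContext:

    // Was the variable visible to the JVM at all?
    println(sys.env.get("HADOOP_CONF_DIR"))

    // If it was not picked up, the Hadoop config files can be added explicitly.
    import org.apache.hadoop.fs.Path
    sc.hadoopConfiguration.addResource(new Path("C:/hadoop/conf/core-site.xml"))
    sc.hadoopConfiguration.addResource(new Path("C:/hadoop/conf/hdfs-site.xml"))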

Re: How can we control CPU and Memory per Spark job operation..

2016-07-17 Thread Jacek Laskowski
Hi, How would that help?! Why would you do that? Jacek On 17 Jul 2016 7:19 a.m., "Pedro Rodriguez" wrote: > You could call map on an RDD which has “many” partitions, then call > repartition/coalesce to drastically reduce the number of partitions so that > your second
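
For context, a minimal sketch of what the quoted suggestion describes, with rdd and expensiveTransform as placeholders:

    // Run the expensive map with the RDD's existing (high) parallelism...
    val mapped = rdd.map(expensiveTransform)

    // ...then shrink the number of partitions before a lighter downstream stage.
    val narrowed = mapped.coalesce(8)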

Spark (on Windows) not picking up HADOOP_CONF_DIR

2016-07-17 Thread Daniel Haviv
Hi, I'm running Spark using IntelliJ on Windows, and even though I set HADOOP_CONF_DIR it does not affect the contents of sc.hadoopConfiguration. Has anybody encountered this? Thanks, Daniel