[PySpark join] Resolved attribute(s) missing from... Attribute(s) with the same name appear in the operation

2018-10-05 Thread Buckler, Christine
Does anyone know what would cause this type of error? I can’t find anything wrong with the dataframes or the join method.
Dataframe columns:
neighbors_for_one.columns ['style_id', 'colorcode', 'asset_id', 'distance']
apparel_seeds.columns ['style_number',

Re: Recreate Dataset from list of Row in spark streaming application.

2018-10-05 Thread Shixiong(Ryan) Zhu
oracle.insight.spark.identifier_processor.ForeachFederatedEventProcessor is a ForeachWriter, right? You cannot use SparkSession in its process method, because that method runs on the executors.
Best Regards, Ryan
On Fri, Oct 5, 2018 at 6:54 AM Kuttaiah Robin wrote:
> Hello,
>
> I have a spark streaming
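Ryan's point, that the process method runs on executors and so cannot reach the driver's SparkSession, can be sketched in Python. This is only an illustration, not the poster's Java code: the class name `EventProcessor`, the `prefix` parameter, and the in-memory `handled` list are hypothetical. The `open`/`process`/`close` method shape is the one PySpark's `DataStreamWriter.foreach` accepts for a row-handler object. The handler keeps only plain configuration and does ordinary per-row work.

```python
import json


class EventProcessor:
    """Hypothetical row handler with the open/process/close methods
    that PySpark's DataStreamWriter.foreach accepts."""

    def __init__(self, prefix):
        # Keep only plain configuration values here. A SparkSession,
        # SparkContext, or DataFrame must not be stored or used: the
        # handler is shipped to executors, where those do not exist.
        self.prefix = prefix
        self.handled = None

    def open(self, partition_id, epoch_id):
        # Runs on the executor before any row of the partition arrives.
        # Returning True tells Spark to go on and call process().
        self.handled = []
        return True

    def process(self, row):
        # Ordinary Python per row; no Spark API calls in here.
        # A Spark Row offers asDict(); a plain dict is accepted as well
        # so the class can be exercised without a cluster.
        record = row.asDict() if hasattr(row, "asDict") else dict(row)
        self.handled.append(self.prefix + json.dumps(record, sort_keys=True))

    def close(self, error):
        # Runs after the partition is done; surface any failure.
        if error is not None:
            raise error


# Local dry run without Spark: drive the handler by hand.
processor = EventProcessor("event: ")
processor.open(partition_id=0, epoch_id=0)
processor.process({"id": 1, "kind": "click"})
processor.close(None)
print(processor.handled)
```

In a streaming job, an object like this would be handed to `writeStream.foreach(...)`. If the per-batch logic really needs DataFrame operations, `foreachBatch` is the usual alternative, since its function runs on the driver.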

Re: Specifying different version of pyspark.zip and py4j files on worker nodes with Spark pre-installed

2018-10-05 Thread Marcelo Vanzin
Sorry, I can't help you if that doesn't work. Your YARN RM really should not have SPARK_HOME set if you want to use more than one Spark version.
On Thu, Oct 4, 2018 at 9:54 PM Jianshi Huang wrote:
>
> Hi Marcelo,
>
> I see what you mean. Tried it but still got same error message.
>
>> Error from

Unsubscribe

2018-10-05 Thread Donni Khan

Recreate Dataset from list of Row in spark streaming application.

2018-10-05 Thread Kuttaiah Robin
Hello, I have a spark streaming application which reads from Kafka based on the given schema.
Dataset m_oKafkaEvents = getSparkSession().readStream().format("kafka")
    .option("kafka.bootstrap.servers", strKafkaAddress)
    .option("assign", strSubscription)