1.6.x I think this may work with spark-csv
> <https://github.com/databricks/spark-csv> :
>
> spark.read.format("com.databricks.spark.csv").option("header", "false")
>   .schema(custom_schema)
>   .option('delimiter', '\t')
>   .option('mode', 'DROPMALFORMED')
>   .load(paths.split(','))
However, even though it is mentioned that this approach would work in Spark 2.x,
I can't find an implementation of load that accepts an Array[String] as an input
parameter.
Thanks in advance for your help.
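For what it's worth, in the Scala API of Spark 2.x I believe DataFrameReader.load has a varargs overload (load(paths: String*)), so an Array[String] can be expanded with `.load(paths.split(','): _*)`; in PySpark, load accepts a plain list of paths. A minimal sketch of the splitting pattern, using a stand-in load function since Spark itself is not assumed available here:

```python
# Stand-in for spark.read...load (hypothetical; Spark is not used here).
# In PySpark 2.x the real load() accepts a list of paths directly.
def load(paths):
    """Record the list of paths it was given, like load(path=[...])."""
    return list(paths)

# Split a comma-separated path string into a list, as in paths.split(',')
paths = "/data/2017/day1.tsv,/data/2017/day2.tsv"
result = load(paths.split(","))
```

In Scala, the equivalent expansion would be `.load(paths.split(","): _*)`.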
> Time_inc#200, Health#1014, Inf_period#1039,
> infectedFamily#1355L, infectedWorker#1385L]
>
> +- Aggregate [S_ID#1903L], [S_ID#1903L, count(1) AS infectedStreet#1415L]
>
> Does someone have a clue about it?
> Thanks,
Didac Gil de la Iglesia
PhD in Computer Science
didacg...@gmail.com
Spain: +34 696 285 544
Sweden: +46 (0)730229737
Skype: didac.gil.de.la.iglesia
signature.asc
Description: Message signed with OpenPGP
t to console? When I run my standalone test Kafka consumer
> jar, I can see that it is receiving messages, so I am not sure what is going
> on with the above code. Any ideas?
>
> Thanks!
> I want to get the result as follows:
> user_id1 feature1 feature2 feature3 feature4 feature5...feature100
>
> Is there a more efficient way other than a join?
>
> Thanks!
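One alternative to chaining joins is to gather all the features for each user in a single pass; in Spark this is roughly what groupBy plus pivot does, if the features live in one long-format table. A toy sketch of the idea in plain Python, with made-up row data (not Spark code):

```python
from collections import defaultdict

# Gather all (user_id, feature_name, value) rows into one wide record
# per user in a single pass, instead of joining one table per feature.
rows = [
    ("user_id1", "feature1", 0.1),
    ("user_id1", "feature2", 0.5),
    ("user_id2", "feature1", 0.9),
]

wide = defaultdict(dict)
for user, feat, val in rows:
    wide[user][feat] = val
```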
playing around with
> coalesce in a SQL expression, but I'm not having any luck here either.
>
> Obviously, I can do a null check on the fields downstream; however, it is not
> in the spirit of Scala to pass around nulls, so I wanted to see if I was
> missing another approach first.
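For reference, the semantics being asked about (take the first non-null value) can be sketched in a few lines of plain Python; this is only an illustration of what SQL COALESCE does, not Spark code:

```python
def coalesce(*values):
    """Return the first argument that is not None, mimicking SQL COALESCE.
    Returns None if every argument is None."""
    return next((v for v in values if v is not None), None)

first = coalesce(None, None, "fallback")
```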
Spark can be both a consumer and a producer from the Kafka point of view.
You can create a Kafka client in Spark that subscribes to a topic and reads the
feed, and you can process data in Spark and create a producer that sends
that data into a topic.
So, Spark sits next to Kafka and you can use
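A toy model of those two roles, with in-memory queues standing in for Kafka topics (no real Kafka client involved):

```python
from collections import deque

# Toy model: two "topics" as in-memory queues, not real Kafka.
input_topic = deque(["3", "1", "2"])
output_topic = deque()

# Consumer role: read each record from the input topic; then process it
# and (producer role) publish the result to the output topic.
while input_topic:
    record = input_topic.popleft()
    processed = int(record) * 10  # stand-in for the Spark transformation
    output_topic.append(processed)
```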
Is 1570 the value of Col1?
If so, you have ordered by that column and selected only the first item. It
seems that both results have the same Col1 value, so either of them would be
a valid row to return. Right?
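The same effect is easy to reproduce outside Spark: sort rows on one key with a tie, and the "first" row is only pinned down up to that key. A small Python illustration with made-up rows:

```python
# Two rows share the sort key 1570, so after sorting on that key alone,
# either of them may legitimately come first.
rows = [("rowA", 1570), ("rowB", 1570), ("rowC", 2000)]
first_row = sorted(rows, key=lambda r: r[1])[0]
# The only guarantee is about the sort key, not which tied row wins.
```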
> On 2 Feb 2017, at 11:03, Alex wrote:
>
> Hi As shown
Are you sure that “age” is a numeric field?
Even if it is numeric, you could pass the “44” between quotes:
INSERT INTO your_table ("user","age","state") VALUES ('user3','44','CT')
Are you sure there are no more fields that are specified as NOT NULL and for
which you did not provide a value (besides user,
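To illustrate both points, here is a small sketch using SQLite (an assumption; the original database may behave differently): a quoted '44' is accepted for an INTEGER column, while omitting a NOT NULL column fails:

```python
import sqlite3

# Sketch with SQLite, not necessarily the database the poster is using.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE your_table ("user" TEXT NOT NULL, age INTEGER, state TEXT)'
)

# A numeric value passed between quotes is accepted for an INTEGER column here.
conn.execute(
    "INSERT INTO your_table (\"user\", age, state) VALUES ('user3', '44', 'CT')"
)
row = conn.execute("SELECT age FROM your_table").fetchone()

# Omitting a NOT NULL column, on the other hand, raises an error.
try:
    conn.execute("INSERT INTO your_table (age, state) VALUES (44, 'CT')")
    not_null_violated = False
except sqlite3.IntegrityError:
    not_null_violated = True
```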
Any suggestions for using something like OneHotEncoder and StringIndexer on
an InputDStream?
I could try to combine an Indexer based on a static Parquet file, but I want to
use the OneHotEncoder approach on streaming data coming from a socket.
Thanks!
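The idea behind applying StringIndexer/OneHotEncoder to a stream can be sketched without Spark ML at all: keep a growing label-to-index map across batches and one-hot encode against it. A plain-Python sketch (hypothetical helper names, not the Spark API):

```python
# Growing label -> index map, shared across all batches of the stream.
index = {}

def string_index(label):
    """Assign the next free index to unseen labels, like a streaming StringIndexer."""
    if label not in index:
        index[label] = len(index)
    return index[label]

def one_hot(label, size):
    """Dense one-hot vector of the given size for the label's index."""
    vec = [0] * size
    vec[string_index(label)] = 1
    return vec

# First batch arriving from the stream (made-up data).
batch1 = ["cat", "dog", "cat"]
indices = [string_index(x) for x in batch1]
```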
Dídac Gil de la Iglesia