For the first one,
input.map { case (x, l) => (x, l.reduce(_ + _)) }
will do what you need.
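For example, a quick sketch assuming input is a pair RDD whose values are lists of numbers (the names and values here are just illustrative, and sc is your SparkContext):
val input = sc.parallelize(Seq(("a", List(1, 2, 3)), ("b", List(4, 5))))
val summed = input.map { case (x, l) => (x, l.reduce(_ + _)) }
summed.collect()   // Array((a,6), (b,9))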
For the second, yes, there's a difference: one is a List, the other is a Tuple. See for instance:
val a = (1,2,3)
a.getClass.getName
res4: String = scala.Tuple3
You should look up tuples.
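For instance, a quick contrast (the values are just illustrative):
val l = List(1, 2, 3)      // List[Int]: homogeneous elements, arbitrary length, indexed with l(0)
val t = (1, "two", 3.0)    // Tuple3[Int, String, Double]: fixed arity, mixed types, accessed with t._1, t._2, t._3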
I came to a similar solution to a similar problem. I deal with a lot of CSV
files from many different sources and they are often malformed.
However, I just have success/failure. Maybe you should make SuccessWithWarnings a subclass of Success, or get rid of it altogether, making the warnings part of the Success result.
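Something along these lines, as a rough sketch (the type and field names are made up, not your actual code):
sealed trait ParseResult[+A]
case class ParseSuccess[A](value: A, warnings: Seq[String] = Nil) extends ParseResult[A]
case class ParseFailure(errors: Seq[String]) extends ParseResult[Nothing]

// A row that parsed with a recoverable issue still carries a value:
val r: ParseResult[Array[String]] =
  ParseSuccess(Array("1", "n/a"), warnings = Seq("field 2 was empty, used default"))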
If HDFS is on a Linux VM, you could also mount it with FUSE and export it with Samba.
2015-08-29 2:26 GMT-07:00 Ted Yu yuzhih...@gmail.com:
See
https://hadoop.apache.org/docs/r2.7.0/hadoop-project-dist/hadoop-hdfs/HdfsNfsGateway.html
FYI
On Sat, Aug 29, 2015 at 1:04 AM, Akhil Das
It depends: if HDFS is running under Windows, FUSE won't work, but if HDFS is on a Linux VM, box, or cluster, then you can have the Linux box/VM mount HDFS through FUSE and at the same time export its mount point over Samba. At that point, your Windows machine can just connect to the Samba share.
R.
August 2015 at 20:43, Roberto Congiu roberto.con...@gmail.com
wrote:
When you launch your HDP guest VM, most likely it gets launched with NAT and an address on a private network (192.168.x.x), so on your Windows host you should use that address (you can find out using ifconfig on the guest OS).
I can't imagine I'm the only person on the planet wanting to do this.
Anyway, thanks for trying to help.
Dino.
On 25 August 2015 at 08:22, Roberto Congiu roberto.con...@gmail.com
wrote:
Port 8020 is not the only port you need tunnelled for HDFS to work. If you only list
When you launch your HDP guest VM, most likely it gets launched with NAT and an address on a private network (192.168.x.x), so on your Windows host you should use that address (you can find out using ifconfig on the guest OS).
I usually add an entry to my /etc/hosts for VMs that I use often.
2015-08-21 3:17 GMT-07:00 smagadi sudhindramag...@fico.com:
teenagers.toJSON gives the JSON but it does not preserve the parent ids, meaning if the input was
{"name":"Yin", "address":{"city":"Columbus","state":"Ohio"}, "age":20}
val x = sqlContext.sql("SELECT name, address.city, address.state, age FROM
I wrote a brief how-to on building nested records in Spark and storing them in Parquet here:
http://www.congiu.com/creating-nested-data-parquet-in-spark-sql/
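The gist, as a minimal sketch (Spark 1.4+ API; assumes sc and sqlContext are in scope, and the field names are just examples, not from the post):
import org.apache.spark.sql.Row
import org.apache.spark.sql.types._

val schema = StructType(Seq(
  StructField("name", StringType),
  StructField("children", ArrayType(StructType(Seq(
    StructField("name", StringType),
    StructField("age", IntegerType)))))))

val rows = sc.parallelize(Seq(
  Row("Yin", Seq(Row("Alice", 5), Row("Bob", 7)))))

val df = sqlContext.createDataFrame(rows, schema)
df.write.parquet("nested.parquet")   // the nested structure is kept in the Parquet schema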
2015-06-23 16:12 GMT-07:00 Richard Catlin richard.m.cat...@gmail.com:
How do I create a DataFrame (SchemaRDD) with a nested array of Rows?