Isn't it space-separated data? It is neither comma (,) separated nor pipe (|) separated.
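One possible cause of the exception with "|", assuming the program builds its schema with String.split as the programming guide does (the actual code and stack trace are not shown in this thread): split takes a regular expression, and a bare "|" is an alternation of two empty patterns, so it splits between every character instead of at the pipes. The schema string below is hypothetical and only illustrates the difference.

```scala
import java.util.regex.Pattern

object SplitSketch {
  // Hypothetical schema string; the one used in the original program is not shown.
  val schemaString = "id|gender|height"

  // "|" is read as a regex alternation that matches the empty string,
  // so the string is cut between characters rather than at each pipe.
  val naive: Array[String] = schemaString.split("|")

  // Quoting the separator makes the regex engine treat it as a literal.
  val quoted: Array[String] = schemaString.split(Pattern.quote("|"))

  // Scala's Char overload of split is also literal.
  val byChar: Array[String] = schemaString.split('|')
}
```

With the naive split the schema ends up with many one-character fields while each row still has three values, which would explain a failure at runtime. A "," has no special meaning in a regex, which fits the report that the comma works.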
Thanks
Best Regards

On Mon, Aug 10, 2015 at 12:06 PM, Netwaver <wanglong_...@163.com> wrote:

> Hi Spark experts,
>
> I am now using Spark 1.4.1 and trying the Spark SQL/DataFrame API with a
> text file in the format below:
>
> id gender height
> 1 M 180
> 2 F 167
> ... ...
>
> But I meet issues as described below:
>
> 1. In my test program, I specify the schema programmatically, but when I
> use "|" as the separator in the schema string, the code runs into the
> exception below when executed on the cluster (Standalone).
>
> When I use "," as the separator, everything works fine.
>
> 2. In the code, when I use the DataFrame.agg() function with the same
> column name for different statistics functions (max, min, avg):
>
> val peopleDF = sqlCtx.createDataFrame(rowRDD, schema)
> peopleDF.filter(peopleDF("gender").equalTo("M")).agg(Map("height" -> "avg","height" -> "max","height" -> "min")).show()
>
> I find that only the last function's result is shown (as below). Does
> this work as designed in Spark?
>
> Hopefully I have described the "issue" clearly, and please feel free to
> correct me if I have done something wrong. Thanks a lot.
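On the second question: a Scala Map keeps one value per key, so Map("height" -> "avg","height" -> "max","height" -> "min") already contains only "height" -> "min" before agg() ever sees it. That is ordinary Scala Map behaviour rather than anything Spark does. A minimal sketch of one way around it, passing Column expressions from org.apache.spark.sql.functions instead of a Map; the local master, the app name, and the two sample rows (taken from the file excerpt in the question) are assumptions made only so the example is self-contained.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.functions.{avg, max, min}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

object AggSketch {
  def main(args: Array[String]): Unit = {
    // Duplicate keys collapse: only the last pair survives in the Map.
    val duplicated = Map("height" -> "avg", "height" -> "max", "height" -> "min")
    assert(duplicated == Map("height" -> "min"))

    // Assumption: a local master, so the sketch runs without a cluster.
    val conf = new SparkConf().setAppName("agg-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlCtx = new SQLContext(sc)

    // Schema specified programmatically, matching the columns in the question.
    val schema = StructType(Seq(
      StructField("id", IntegerType),
      StructField("gender", StringType),
      StructField("height", IntegerType)))

    // The two sample rows shown in the question.
    val rowRDD = sc.parallelize(Seq(Row(1, "M", 180), Row(2, "F", 167)))
    val peopleDF = sqlCtx.createDataFrame(rowRDD, schema)

    // One Column expression per statistic, so none of them overwrites another.
    peopleDF.filter(peopleDF("gender").equalTo("M"))
      .agg(avg("height"), max("height"), min("height"))
      .show()

    sc.stop()
  }
}
```

The Map form of agg() is still convenient when each column needs only one aggregate; the Column form is the one to reach for when the same column needs several.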