Re: sc.textFile can't recognize '\004'

2014-06-21 Thread anny9699
Thanks a lot Sean! It works for me now~~

Re: sc.textFile can't recognize '\004'

2014-06-20 Thread Sean Owen
These are actually Scala / Java questions.

On Sat, Jun 21, 2014 at 1:08 AM, anny9699 wrote:
> 1) One of the separators is '\004', which could be recognized by python or R
> or Hive, however Spark seems can't recognize this one and returns a symbol
> looking like '?'. Also this symbol is not a que

sc.textFile can't recognize '\004'

2014-06-20 Thread anny9699
Hi, I need to parse a file whose fields are separated by a series of separators. I used SparkContext.textFile and ran into two problems: 1) One of the separators is '\004', which Python, R, and Hive can all recognize; Spark, however, doesn't seem to recognize it and returns a symbol that looks like '?'. A
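
For readers landing on this thread: the reply notes that this is a Scala / Java question rather than a Spark one, so here is a minimal plain-Java sketch of splitting a line on the control character U+0004 (the character written as '\004' in octal notation). It is not the answer given in the thread, which is cut off above. The class name `ControlCharSplit` and the helper `splitLine` are hypothetical names chosen for illustration; only `String.split` and `Pattern.quote` are standard library calls.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class ControlCharSplit {
    // U+0004 (END OF TRANSMISSION), i.e. the character with code point 4.
    // Hypothetical constant name, used only in this sketch.
    static final String SEPARATOR = "\u0004";

    // Split one line on the separator. Pattern.quote makes the separator a
    // literal for the regex engine, and the -1 limit keeps trailing empty fields.
    static String[] splitLine(String line) {
        return line.split(Pattern.quote(SEPARATOR), -1);
    }

    public static void main(String[] args) {
        // Sample input invented for this sketch: three fields, the last one empty.
        String line = "a" + SEPARATOR + "b" + SEPARATOR;
        System.out.println(Arrays.toString(splitLine(line)));
    }
}
```

In a Spark job, the same per-line split would presumably be applied inside a `map` over the lines returned by `sc.textFile`; that wiring is an assumption and is not shown here.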