Dapeng Sun created SPARK-21661:
----------------------------------

             Summary: SparkSQL can't merge load table from Hadoop
                 Key: SPARK-21661
                 URL: https://issues.apache.org/jira/browse/SPARK-21661
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.2.0
            Reporter: Dapeng Sun

Here are the original text files of the external table on HDFS:

{noformat}
Permission  Owner  Group       Size   Last Modified          Replication  Block Size  Name
-rw-r--r--  root   supergroup  0 B    8/6/2017, 11:43:03 PM  3            256 MB      income_band_001.dat
-rw-r--r--  root   supergroup  0 B    8/6/2017, 11:39:31 PM  3            256 MB      income_band_002.dat
...
-rw-r--r--  root   supergroup  327 B  8/6/2017, 11:44:47 PM  3            256 MB      income_band_530.dat
{noformat}

After the SparkSQL load, every input file has its own output file, even when the input file is 0 B. When the same load is run on Hive, the data files are merged according to the data size of the original files.
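
To make the requested behavior concrete, here is a minimal sketch of size-based merging in plain Python. It is not Hive's actual merge logic and it uses no Spark or Hive API. The function name `plan_merged_outputs` and the `target_bytes` parameter are hypothetical, and the 256 MB target is an assumption borrowed from the block size shown in the listing. The three input entries are the ones visible in the listing, and the elided files are left out.

```python
from typing import List, Tuple


def plan_merged_outputs(
    files: List[Tuple[str, int]], target_bytes: int
) -> List[List[str]]:
    """Group input files into output groups by cumulative size.

    Hypothetical helper for illustration only. Empty files carry no
    data, so they are skipped instead of producing an output file.
    """
    groups: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for name, size in files:
        if size == 0:
            continue  # a 0 B input should not yield its own output
        # Close the current group if adding this file would exceed the target.
        if current and current_size + size > target_bytes:
            groups.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        groups.append(current)
    return groups


# Names and sizes as shown in the listing; the target size is an assumption.
inputs = [
    ("income_band_001.dat", 0),
    ("income_band_002.dat", 0),
    ("income_band_530.dat", 327),
]
plan = plan_merged_outputs(inputs, target_bytes=256 * 1024 * 1024)
print(plan)  # [['income_band_530.dat']]
```

Under these assumptions the two empty inputs produce no output and the single 327 B file forms one group, in contrast to the one-output-per-input result described in the report.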