> >> So the question is: why does Spark, particularly 2.1.0, only generate
> >> min/max statistics for numeric columns, but not for string (BINARY)
> >> fields, even if the string field is included in the sort? Maybe I
> >> missed a configuration?
> >>
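One plausible explanation for the first question (an assumption on my part, not something confirmed in this thread): historically, min/max statistics for BINARY columns in Parquet files could not be trusted, because older parquet-mr writers computed them with Java's signed byte comparison, which orders multi-byte UTF-8 strings incorrectly, so readers of that era ignored (and writers often skipped) binary stats. The sketch below, in plain Java (the language of parquet-mr), shows why a signed-byte ordering gives a wrong min/max for UTF-8 data:

```java
import java.nio.charset.StandardCharsets;

public class BinaryStatsOrder {
    // Lexicographic compare using *signed* bytes — the problematic historical ordering
    static int signedCompare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            if (a[i] != b[i]) return Byte.compare(a[i], b[i]);
        }
        return Integer.compare(a.length, b.length);
    }

    // Lexicographic compare using *unsigned* bytes — the order UTF-8 strings need
    static int unsignedCompare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (cmp != 0) return cmp;
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        byte[] ascii = "a".getBytes(StandardCharsets.UTF_8);    // 0x61
        byte[] accented = "é".getBytes(StandardCharsets.UTF_8); // 0xC3 0xA9 — negative as signed bytes
        // Signed order wrongly puts "é" before "a"; unsigned order is correct
        System.out.println(signedCompare(accented, ascii) < 0);   // true (wrong ordering)
        System.out.println(unsignedCompare(accented, ascii) > 0); // true (correct ordering)
    }
}
```

With min/max computed in the wrong order, a reader that pruned row groups by those stats could silently drop matching rows, so dropping the stats entirely was the safe choice.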
>> The second issue: when I ran
>> ... /secret/spark21-sortById` where id=4").show
>> I got many lines like this:
>> 17/01/17 09:23:35 INFO FilterCompat: Filtering using predicate:
>> and(noteq(id, null), eq(id, 4))
>> 17/01/17 09:23:35 INFO FileScanRDD: Reading File path:
>> file:///secret/...7ac12-6038-46ee-b5c3-d7a5a06e4425.snappy.parquet,
>> range: 0-558, partition values: [empty row]
>> ...
> 17/01/17 09:23:35 INFO FilterCompat: Filtering using predicate:
> and(noteq(id, null), eq(id, 4))
> 17/01/17 09:23:35 INFO FileScanRDD: Reading File path:
> file:///secret/spark21-sortById/part-00193-39f7ac12-6038-46ee-b5c3-d7a5a06e4425.snappy.parquet,
> range: 0-574, partition values: [empty row]
> ...
>
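What the log above suggests is that every part file is being opened and filtered, whereas data sorted by id should allow file-level skipping: with footer min/max statistics, a filter like id=4 only needs the files whose [min, max] range can contain 4. A minimal sketch of that pruning logic (hypothetical file names and statistics, not taken from the job above):

```java
import java.util.ArrayList;
import java.util.List;

public class MinMaxPruning {
    // Hypothetical per-file footer statistics for one column
    static class FileStats {
        final String path; final long min; final long max;
        FileStats(String path, long min, long max) {
            this.path = path; this.min = min; this.max = max;
        }
    }

    // Keep only the files whose [min, max] range can contain the literal
    static List<String> filesToScan(List<FileStats> files, long value) {
        List<String> keep = new ArrayList<>();
        for (FileStats f : files) {
            if (value >= f.min && value <= f.max) keep.add(f.path);
        }
        return keep;
    }

    public static void main(String[] args) {
        // Data sorted by id and split into three files yields disjoint ranges
        List<FileStats> files = List.of(
            new FileStats("part-00000", 0, 99),
            new FileStats("part-00001", 100, 199),
            new FileStats("part-00002", 200, 299));
        System.out.println(filesToScan(files, 4)); // [part-00000]
    }
}
```

If the Parquet footers carry no usable min/max for the filtered column, this pruning cannot happen and every file is scanned, which would tie the second issue back to the statistics question in the first.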
The q