> On May 4, 2014, 9:24 p.m., Aditya Kishore wrote:
> > exec/java-exec/src/test/resources/functions/string/testSubstr.json, line 37
> > <https://reviews.apache.org/r/21058/diff/1/?file=574172#file574172line37>
> >
> > Could you please add a test case with non-English string, for example
> > Hindi or Chinese.
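Non-ASCII characters such as Hindi are not handled correctly at the moment.
I am getting an IndexOutOfBoundsException because each Hindi (Devanagari)
character takes three bytes in UTF-8, versus one byte for an English (ASCII)
character. The literal below is 11 characters long but encodes to 31 bytes
(10 Hindi characters x 3 bytes + 1 space), which would explain the
"length: 31 (expected: range(0, 11))" in the trace: the buffer seems to be
sized by character count while the full byte array is written into it.
When I add Hindi strings to the test case I get the exception below.
Any tips on solving this?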
-----------------------------------------------------------------------------------------------------------------
Physical plan input ('????? ?????'):
{ ref: "col12", expr: "substring('????? ?????', 3,10)"},
{ ref: "col12", expr: "substring('????? ?????', 3)"}
-----------------------------------------------------------------------------------------------------------------
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 9.973 sec <<< FAILURE! - in org.apache.drill.exec.physical.impl.TestStringFunctions
testSubstr(org.apache.drill.exec.physical.impl.TestStringFunctions) Time elapsed: 3.517 sec <<< ERROR!
java.lang.IndexOutOfBoundsException: index: 0, length: 31 (expected: range(0, 11))
    at io.netty.buffer.AbstractByteBuf.checkIndex(AbstractByteBuf.java:1130)
    at io.netty.buffer.UnpooledUnsafeDirectByteBuf.setBytes(UnpooledUnsafeDirectByteBuf.java:341)
    at io.netty.buffer.AbstractByteBuf.setBytes(AbstractByteBuf.java:502)
    at io.netty.buffer.SwappedByteBuf.setBytes(SwappedByteBuf.java:396)
    at org.apache.drill.exec.vector.ValueHolderHelper.getVarCharHolder(ValueHolderHelper.java:49)
    at org.apache.drill.exec.test.generated.ProjectorGen0.doSetup(ProjectorTemplate.java:997)
    at org.apache.drill.exec.test.generated.ProjectorGen0.setup(ProjectorTemplate.java:90)
    at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchema(ProjectRecordBatch.java:175)
    at org.apache.drill.exec.record.AbstractSingleRecordBatch.next(AbstractSingleRecordBatch.java:53)
    at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:111)
    at org.apache.drill.exec.physical.impl.SimpleRootExec.next(SimpleRootExec.java:71)
    at org.apache.drill.exec.physical.impl.TestStringFunctions.runTest(TestStringFunctions.java:99)
    at org.apache.drill.exec.physical.impl.TestStringFunctions.testSubstr(TestStringFunctions.java:204)
-----------------------------------------------------------------------------------------------------------------
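To make the size mismatch concrete, here is a minimal sketch in plain Java
(no Drill or Netty classes). The class name Utf8Sizing and the sample
Devanagari string are my own assumptions for illustration; the original test
literal is not reproduced here. It only shows that a buffer has to be sized
by the encoded byte count rather than by String.length(); it is not a patch
for ValueHolderHelper.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Drill: contrasts char count with UTF-8 byte count.
final class Utf8Sizing {

    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8ByteLength(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Allocates a direct buffer sized by the encoded bytes, not by s.length().
    static ByteBuffer toUtf8Buffer(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocateDirect(bytes.length);
        buf.put(bytes);
        buf.flip(); // make the written bytes readable from position 0
        return buf;
    }

    public static void main(String[] args) {
        // Sample Devanagari text (an assumption), written with escapes to
        // avoid source-encoding problems.
        String hindi = "\u0928\u092E\u0938\u094D\u0924\u0947";
        // Prints: chars=6 bytes=18
        System.out.println("chars=" + hindi.length()
                + " bytes=" + utf8ByteLength(hindi));
    }
}
```

Sizing by s.length() in this example would allocate 6 bytes for 18 bytes of
data, which is the same shape of failure as the 11 versus 31 in the trace.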
- Yash
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/21058/#review42105
-----------------------------------------------------------
On May 5, 2014, 10:40 a.m., Yash Sharma wrote:
>
> (Updated May 5, 2014, 10:40 a.m.)
>
>
> Review request for drill, Aditya Kishore, Jacques Nadeau, Jinfeng Ni, and
> Mehant Baid.
>
>
> Repository: drill-git
>
>
> Description
> -------
>
> Adds substr(expression, start), a two-argument form of the string substring
> function that returns everything from position start to the end of the string.
> This also fixes https://issues.apache.org/jira/browse/DRILL-542.
>
>
> Diffs
> -----
>
>
> exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/StringFunctions.java
> aca5933
>
> exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestStringFunctions.java
> 09d1361
> exec/java-exec/src/test/resources/functions/string/testSubstr.json e885381
>
> Diff: https://reviews.apache.org/r/21058/diff/
>
>
> Testing
> -------
>
> Yes.
> ----------------------------------------------------------------------------------------
> JUnit Test Case:
> ----------------------------------------------------------------------------------------
>
> $ mvn test -Dtest=TestStringFunctions#testSubstr
>
> Results :
>
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
>
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 53.030 s
> [INFO] Finished at: 2014-05-04T16:08:26+05:30
> [INFO] Final Memory: 44M/711M
> [INFO] ------------------------------------------------------------------------
>
>
> ----------------------------------------------------------------------------------------
> SQLLINE Test
> ----------------------------------------------------------------------------------------
>
> 0: jdbc:drill:zk=local> SELECT employee_id, first_name, substring(first_name, 3) sub_str FROM cp.`employee.json` limit 20;
> +-------------+------------+------------+
> | employee_id | first_name | sub_str    |
> +-------------+------------+------------+
> | 1           | Sheri      | eri        |
> | 2           | Derrick    | rrick      |
> | 4           | Michael    | chael      |
> | 5           | Maya       | ya         |
> | 6           | Roberta    | berta      |
> | 7           | Rebecca    | becca      |
> | 8           | Kim        | m          |
> | 9           | Brenda     | enda       |
> | 10          | Darren     | rren       |
> | 11          | Jonathan   | nathan     |
> | 12          | Jewel      | wel        |
> | 13          | Peggy      | ggy        |
> | 14          | Bryan      | yan        |
> | 15          | Walter     | lter       |
> | 16          | Peggy      | ggy        |
> | 17          | Brenda     | enda       |
> | 18          | Daniel     | niel       |
> | 19          | Dianne     | anne       |
> | 20          | Beverly    | verly      |
> | 21          | Pedro      | dro        |
> +-------------+------------+------------+
>
>
> Thanks,
>
> Yash Sharma
>
>
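The sqlline output above uses a 1-based start (substring('Sheri', 3) gives
'eri'). For multi-byte text, one way to keep that behaviour is to count
characters rather than bytes while walking the UTF-8 data. The sketch below is
only an illustration in plain Java: the class Utf8Substring and its methods
are hypothetical names, not the code in StringFunctions.java, and returning an
empty result for start < 1 is an assumption on my side. It counts code points,
so combining marks are treated as separate characters.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch, not Drill code: substring over UTF-8 bytes by character position.
final class Utf8Substring {

    // True for bytes that start a character, i.e. anything except 10xxxxxx
    // continuation bytes.
    static boolean isCharStart(byte b) {
        return (b & 0xC0) != 0x80;
    }

    // Bytes of up to `length` characters, beginning at 1-based character `start`.
    static byte[] substring(byte[] utf8, int start, int length) {
        if (start < 1 || length <= 0) {
            return new byte[0]; // assumption: out-of-range input yields empty
        }
        int charIndex = 0;    // 1-based index of the character being visited
        int from = -1;        // byte offset where the result begins
        int to = utf8.length; // byte offset where the result ends (exclusive)
        for (int i = 0; i < utf8.length; i++) {
            if (!isCharStart(utf8[i])) {
                continue; // skip continuation bytes
            }
            charIndex++;
            if (charIndex == start) {
                from = i;
            }
            if (charIndex == (long) start + length) {
                to = i; // first character past the requested range
                break;
            }
        }
        return from < 0 ? new byte[0] : Arrays.copyOfRange(utf8, from, to);
    }

    // Two-argument form: everything from `start` to the end of the string.
    static byte[] substring(byte[] utf8, int start) {
        return substring(utf8, start, Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        byte[] name = "Sheri".getBytes(StandardCharsets.UTF_8);
        // Prints: eri
        System.out.println(new String(substring(name, 3), StandardCharsets.UTF_8));

        // Sample Devanagari text (an assumption); each character is 3 bytes.
        byte[] hindi = "\u0928\u092E\u0938\u094D\u0924\u0947".getBytes(StandardCharsets.UTF_8);
        // Two characters starting at position 3 occupy 6 bytes. Prints: 6
        System.out.println(substring(hindi, 3, 2).length);
    }
}
```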