I'm using Hadoop streaming and currently have these properties on my command line:

  -Dstream.map.output.field.separator=' ' \
  -Dstream.num.map.output.key.fields=1 \

This works for me because my test data happens to have a space at column 14. If I want a fixed-length split instead, is there a simple cut-style function I could use, such as leaving the separator undefined and counting 13 bytes?

  -Dstream.map.output.field.separator= \
  -Dstream.num.map.output.key.fields=13 \

I have searched the forum for discussions on fixed-length keys or splitting keys but have not found an answer. Perhaps this is not possible, at least on the command line?

Thanks,
Kevin
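
For illustration, here is one way the fixed-length split asked about above could be done inside the mapper itself rather than through streaming properties. Note that stream.num.map.output.key.fields counts separator-delimited fields, not bytes, so setting it to 13 would not cut at byte 13. This is only a minimal sketch under assumptions: the function name split_fixed_key and the constant KEY_WIDTH are hypothetical, the 13-byte width comes from the question, and the script relies on tab being the default key/value separator in Hadoop streaming.

```python
import sys

KEY_WIDTH = 13  # assumed fixed key length in bytes, taken from the question


def split_fixed_key(line: bytes, width: int = KEY_WIDTH) -> bytes:
    """Return the line with a tab inserted after the first `width` bytes.

    Hypothetical helper: the first `width` bytes become the key and the
    remainder becomes the value, separated by a tab.
    """
    line = line.rstrip(b"\r\n")
    return line[:width] + b"\t" + line[width:]


def main() -> None:
    # Read raw bytes so the width is counted in bytes, not characters.
    for raw in sys.stdin.buffer:
        sys.stdout.buffer.write(split_fixed_key(raw) + b"\n")


if __name__ == "__main__":
    main()
```

If a script like this were used as the mapper (or as the last stage of the mapper pipeline), the separator and key-field properties could presumably be left at their defaults, since the output would already be in key<TAB>value form. Lines shorter than 13 bytes would end up entirely in the key, with an empty value.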