zenfenan created NIFI-4826:
------------------------------

             Summary: ListAzureBlobStorage doesn't write azure.blobname properly
                 Key: NIFI-4826
                 URL: https://issues.apache.org/jira/browse/NIFI-4826
             Project: Apache NiFi
          Issue Type: Improvement
          Components: Extensions
    Affects Versions: 1.5.0, 1.4.0, 1.3.0, 1.2.0
            Reporter: zenfenan


ListAzureBlobStorage currently derives azure.blobname by taking the substring 
of the blob's primary URI after the last slash, i.e. everything from 
primaryUri.lastIndexOf('/') + 1 onward. For example, if the blob is at 
"mystorageaccountname.blob.core.windows.net/container-name/path/to/the/blob", 
azure.blobname is written as "blob". When a blob sits under a multi-level 
directory structure like this one, downstream processors such as 
FetchAzureBlobStorage run into trouble: they expect the full blob name, i.e. 
"path/to/the/blob", and fail when given just "blob".
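
To make the behavior concrete, here is a minimal sketch of the substring 
logic described above. The class and method names are hypothetical and are 
not taken from the processor's source; only the lastIndexOf('/') + 1 
expression comes from this report.

```java
class BlobNameDemo {
    // Hypothetical helper: keeps only what follows the last '/' in the URI,
    // which is the behavior this issue describes.
    static String lastSegment(String primaryUri) {
        return primaryUri.substring(primaryUri.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        String uri = "https://mystorageaccountname.blob.core.windows.net"
                + "/container-name/path/to/the/blob";
        // The "path/to/the/" part of the blob name is lost.
        System.out.println(lastSegment(uri)); // prints "blob"
    }
}
```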

A workaround available right now is to use "ExecuteScript" to take the 
substring of the primary URI, i.e. everything after 
"https://"+storageAccountName+"/"+containerName+"/". A better approach would be 
to use the CloudBlob.getName() API provided by the Azure SDK. It should be a 
minor change, since our processor already uses this SDK and this class.
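
The string handling behind the workaround could look roughly like the sketch 
below. It is plain Java rather than an actual ExecuteScript body, the class 
and method names are hypothetical, and it assumes the container name is known 
and appears as the first path segment of the primary URI. Note that 
URI.getPath() returns the percent-decoded path.

```java
import java.net.URI;

class BlobNameFromUri {
    // Hypothetical helper: returns everything after "/<containerName>/"
    // in the path of the primary URI.
    static String blobName(String primaryUri, String containerName) {
        // e.g. "/container-name/path/to/the/blob"
        String path = URI.create(primaryUri).getPath();
        String prefix = "/" + containerName + "/";
        if (!path.startsWith(prefix)) {
            throw new IllegalArgumentException(
                    "URI path does not start with " + prefix);
        }
        return path.substring(prefix.length());
    }

    public static void main(String[] args) {
        String uri = "https://mystorageaccountname.blob.core.windows.net"
                + "/container-name/path/to/the/blob";
        System.out.println(blobName(uri, "container-name"));
        // prints "path/to/the/blob"
    }
}
```

Using CloudBlob.getName() inside the processor would make this kind of 
parsing unnecessary.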



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
