Hello Patrick,

Thanks for the READ() script function! This is certainly a useful extension of
the script toolset.

However, I'd like to mention in this context that the engine has long
contained a mechanism for the general problem of large fields.

Some data (usually opaque binary data like the PHOTO or email attachments, but
also possibly very large text fields like NOTE) should be loaded on demand
only, and not together with the syncset and the fields that are needed for ID
and content matching.

So string fields can have a proxy object (a better term would probably be "data
provider"), which is not called until the contents of the field are actually
needed, usually when encoding for the remote peer. The "p" "mode" flag in the
datastore <map>s controls the use of field proxies.
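
To illustrate the idea, here is a minimal sketch of such an on-demand field.
It is not the engine's actual field or proxy class: LazyStringField and its
loader callback are names invented for this example, and the loader simply
stands in for whatever single-field pull the backend provides.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical illustration only, not the libsynthesis classes.
// The field keeps a loader callback (the "data provider") and calls it
// the first time the contents are actually requested.
class LazyStringField {
public:
  using Loader = std::function<std::string()>;

  explicit LazyStringField(Loader loader) : fLoader(std::move(loader)) {
    // A field without a provider could never deliver its contents.
    assert(static_cast<bool>(fLoader));
  }

  // Returns the contents, pulling them through the loader on first use.
  const std::string &value() {
    if (!fLoaded) {
      fValue = fLoader();
      fLoaded = true;
    }
    return fValue;
  }

  // True once the contents have been pulled from the provider.
  bool isLoaded() const { return fLoaded; }

private:
  Loader fLoader;
  std::string fValue;
  bool fLoaded = false;
};
```

Syncset loading and matching would never call value() on such a field, so the
loader only runs when the item is actually encoded for the remote peer.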

In the ODBC/SQL backend, each of these proxies is configured with its own SQL
statement, which loads the field's data. In the plugin backend, which is used in
SyncEvolution, the single-field-pull mechanism is mapped onto the
ReadBLOB/WriteBLOB API.

The proxy mechanism was even designed with the idea that really huge data 
should never be loaded as a block, but only streamed through the engine. 
However, that was never implemented on the encoding side, as the current SyncML 
item chunking mechanism is not ready for streamed generation (the total size
must be known in advance).
But for an SQL-based server like our IOT server, it already helps a lot if
contact images are NOT loaded as part of the syncset loading, but only when 
actually needed.

This is just FYI: for the problem at hand, the READ() script solution is
certainly a clean and efficient way to go.
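
For reference, a standalone sketch of what a read-the-whole-file helper like
this boils down to. This is not the libsynthesis code; readWholeFile is a name
made up for this example. One detail to keep in mind: fseek() only reports
success (0) or failure (-1), so the size used for pre-allocation has to come
from ftell().

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Illustrative helper, not part of libsynthesis: reads the whole file at
// 'path' into 'content'. Returns false if the file cannot be opened or a
// read error occurs.
bool readWholeFile(const std::string &path, std::string &content) {
  content.clear();
  std::FILE *in = std::fopen(path.c_str(), "rb");
  if (!in) return false;

  if (std::fseek(in, 0, SEEK_END) == 0) {
    // Seekable: ask ftell() for the size and pre-allocate the result.
    long size = std::ftell(in);
    if (size > 0) content.reserve(static_cast<std::size_t>(size));
    std::rewind(in);
  } else {
    // Not seekable (might not be a plain file): just read sequentially.
    std::clearerr(in);
  }

  char buf[8192];
  std::size_t n;
  while ((n = std::fread(buf, 1, sizeof(buf), in)) > 0) {
    content.append(buf, n);
  }

  bool ok = !std::ferror(in);
  std::fclose(in);
  return ok;
}
```

The file is opened in binary mode so that photo data comes back byte for byte,
including embedded NUL bytes.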

Best Regards,

Lukas


On Jul 22, 2011, at 9:37, Patrick Ohly wrote:

> Hello!
> 
> I'm currently working on https://bugs.meego.com/show_bug.cgi?id=19661:
> like N900/Maemo 5 before, MeeGo apps prefer to store URIs of local photo
> files in the PIM storage instead of storing the photo data in EDS
> (better for performance).
> 
> When such a contact is synced to a peer, the photo data must be included
> in the vCard. Ove solved that in his N900 port by inlining the data when
> reading from EDS, before handing the data to SyncEvolution/Synthesis.
> This has the downside that the data must be loaded in all cases,
> including those where it is not needed (slow sync comparison of the
> other properties in server mode) and remains in memory much longer.
> 
> I'd like to propose a more efficient solution that'll work with all
> backends. Ove, if this goes in, then you might be able to remove the
> special handling of photos in your port.
> 
> The idea is that a) the data model gets extended to allow both URI and
> data in PHOTO and b) a file URI gets replaced by the actual file content
> right before sending (but not sooner).
> 
> Lukas, can you review the libsynthesis changes? See below.  I pushed to
> the bmc19661 branch in meego.gitorious.org. Some other fixes are also
> included.
> 
> -----------------------
> $ git log --reverse -p 92d2f367..bmc19661
> 
> commit 01c6ff4f7136d2c72b520818ee1ba89dc53c71f0
> Author: Patrick Ohly <patrick.o...@intel.com>
> Date:   Fri Jul 22 08:12:05 2011 +0200
> 
>    SMLTK: fixed g++ 4.6 compiler warning
> 
>    g++ 4.6 warns that the "rc" variable is getting assigned but never
>    used. smlEndEvaluation() must have been meant to return that error
>    code instead of always returning SML_ERR_OK - fixed.
> 
> diff --git a/src/syncml_tk/src/sml/mgr/all/mgrcmdbuilder.c b/src/syncml_tk/src/sml/mgr/all/mgrcmdbuilder.c
> index ae040a4..601b530 100755
> --- a/src/syncml_tk/src/sml/mgr/all/mgrcmdbuilder.c
> +++ b/src/syncml_tk/src/sml/mgr/all/mgrcmdbuilder.c
> @@ -698,7 +698,7 @@ SML_API Ret_t smlEndEvaluation(InstanceID_t id, MemSize_t *freemem)
>     return SML_ERR_WRONG_USAGE;
> 
>   rc = xltEndEvaluation(id, (XltEncoderPtr_t)(pInstanceInfo->encoderState), freemem);
> -  return SML_ERR_OK;
> +  return rc;
> }
> 
> #endif
> 
> commit 8d5cce896dcc5dba028d1cfa18f08e31adcc6e73
> Author: Patrick Ohly <patrick.o...@intel.com>
> Date:   Fri Jul 22 08:36:22 2011 +0200
> 
>    "blob" fields: avoid binary encoding if possible
> 
>    This change is meant for the PHOTO value, which can contain both
>    binary data and plain text URIs. Other binary data fields might also
>    benefit when their content turns out to be plain text (shorter
>    encoding).
> 
>    The change is that base64 encoding is not enforced if all characters
>    are ASCII and printable. That allows special characters like colon,
>    comma, and semicolon to appear unchanged in the content.
> 
>    Regardless of whether the check succeeds, the result is guaranteed to
>    contain only ASCII characters, either because it only contains those
>    to start with or because of the base64 encoding.
> 
> diff --git a/src/sysync/mimedirprofile.cpp b/src/sysync/mimedirprofile.cpp
> index 4105d03..1499876 100644
> --- a/src/sysync/mimedirprofile.cpp
> +++ b/src/sysync/mimedirprofile.cpp
> @@ -23,6 +23,7 @@
> 
> #include "syncagent.h"
> 
> +#include <ctype.h>
> 
> using namespace sysync;
> 
> @@ -2274,8 +2275,18 @@ sInt16 TMimeDirProfileHandler::generateValue(
>           }
>           // append to existing string
>           fldP->appendToString(outval,maxSiz);
> -          // force B64 encoding
> -          aEncoding=enc_base64;
> +          // force B64 encoding if non-printable or non-ASCII characters
> +          // are in the value
> +          size_t len = outval.size();
> +          for (size_t i = 0; i < len; i++) {
> +            char c = outval[i];
> +            if (!isascii(c) || !isprint(c)) {
> +              aEncoding=enc_base64;
> +              break;
> +            }
> +          }
> +          // only ASCII in value: either because it contains only
> +          // those to start with or because they will be encoded
>           aNonASCII=false;
>         }
>         else {
> 
> commit b69d0aecf612d0f009903179619a983706f3b8f7
> Author: Patrick Ohly <patrick.o...@intel.com>
> Date:   Fri Jul 22 08:44:21 2011 +0200
> 
>    script error messages: fixed invalid memory access
> 
>    If the text goes through macro expansion, then "aScriptText" is not
>    the chunk of memory which holds the script and "text" doesn't point
>    into it anymore. Therefore "text-aScriptText" calculates the wrong
>    offset.
> 
>    Fixed by storing the real start of memory in a different variable
>    and using that instead of aScriptText.
> 
>    Found when enclosing a string with single quotes instead of double
>    quotes. The resulting syntax error message contained garbled
>    characters instead of the real script line.
> 
> diff --git a/src/sysync/scriptcontext.cpp b/src/sysync/scriptcontext.cpp
> index 35dff88..f21641e 100755
> --- a/src/sysync/scriptcontext.cpp
> +++ b/src/sysync/scriptcontext.cpp
> @@ -2464,6 +2464,7 @@ void TScriptContext::Tokenize(TSyncAppBase *aAppBaseP, cAppCharP aScriptName, sI
>     text = itext.c_str();
>   }
>   // actual tokenisation
> +  cAppCharP textstart = text;
>   SYSYNC_TRY {
>     // process text
>     while (*text) {
> @@ -2540,7 +2541,7 @@ void TScriptContext::Tokenize(TSyncAppBase *aAppBaseP, cAppCharP aScriptName, sI
>         else if (StrToEnum(ItemFieldTypeNames,numFieldTypes,enu,p,il)) {
>           // check if declaration and if allowed
>           if (aNoDeclarations && lasttoken!=TK_OPEN_PARANTHESIS)
> -            SYSYNC_THROW(TTokenizeException(aScriptName, "no local variable declarations allowed in this script",aScriptText,text-aScriptText,line));
> +            SYSYNC_THROW(TTokenizeException(aScriptName, "no local variable declarations allowed in this script",textstart,text-textstart,line));
>           // code type into token
>           aTScript+=TK_TYPEDEF; // token
>           aTScript+=1; // length of additional data
> @@ -2616,7 +2617,7 @@ void TScriptContext::Tokenize(TSyncAppBase *aAppBaseP, cAppCharP aScriptName, sI
>           else if (strucmp(p,"WINNING",il)==0) objidx=OBJ_TARGET;
>           else if (strucmp(p,"TARGET",il)==0) objidx=OBJ_TARGET;
>           else
> -            SYSYNC_THROW(TTokenizeException(aScriptName,"unknown object name",aScriptText,text-aScriptText,line));
> +            SYSYNC_THROW(TTokenizeException(aScriptName,"unknown object name",textstart,text-textstart,line));
>           text++; // skip object qualifier
>           aTScript+=TK_OBJECT; // token
>           aTScript+=1; // length of additional data
> @@ -2641,13 +2642,13 @@ void TScriptContext::Tokenize(TSyncAppBase *aAppBaseP, cAppCharP aScriptName, sI
>             p=text;
>             while (isidentchar(*text)) text++;
>             if (text==p)
> -              SYSYNC_THROW(TTokenizeException(aScriptName,"missing macro name after $",aScriptText,text-aScriptText,line));
> +              SYSYNC_THROW(TTokenizeException(aScriptName,"missing macro name after $",textstart,text-textstart,line));
>             itm.assign(p,text-p);
>             // see if we have such a macro
>             TScriptConfig *cfgP = aAppBaseP->getRootConfig()->fScriptConfigP;
>             TStringToStringMap::iterator pos = cfgP->fScriptMacros.find(itm);
>             if (pos==cfgP->fScriptMacros.end())
> -              SYSYNC_THROW(TTokenizeException(aScriptName,"unknown macro",aScriptText,p-1-aScriptText,line));
> +              SYSYNC_THROW(TTokenizeException(aScriptName,"unknown macro",textstart,p-1-textstart,line));
>             TMacroArgsArray macroArgs;
>             // check for macro arguments
>             if (*text=='(') {
> @@ -2772,7 +2773,7 @@ void TScriptContext::Tokenize(TSyncAppBase *aAppBaseP, cAppCharP aScriptName, sI
>             else token=TK_BITWISEOR; // |
>             break;
>           default:
> -            SYSYNC_THROW(TTokenizeException(aScriptName,"Syntax Error",aScriptText,text-aScriptText,line));
> +            SYSYNC_THROW(TTokenizeException(aScriptName,"Syntax Error",textstart,text-textstart,line));
>         }
>       }
>       // add token if simple token found
> 
> commit e3fdd5ca811f24b2f80e598f9d00d2e134aa85e1
> Author: Patrick Ohly <patrick.o...@intel.com>
> Date:   Fri Jul 22 09:04:01 2011 +0200
> 
>    scripting: added READ() method
> 
>    The READ(filename) method returns the content of the file identified
>    with "filename". Relative paths are interpreted relative to the current
>    directory. On failure, an error message is logged and UNASSIGNED
>    is returned.
> 
>    This method is useful for inlining the photo data referenced with
>    local file:// URIs shortly before sending to a remote peer. SyncEvolution
>    uses the method in its outgoing vcard script as follows:
> 
>    Field list:
>          <!-- Photo -->
>          <field name="PHOTO" type="blob" compare="never" merge="fillempty"/>
>          <field name="PHOTO_TYPE" type="string" compare="never" merge="fillempty"/>
>          <field name="PHOTO_VALUE" type="string" compare="never" merge="fillempty"/>
> 
>    Profile:
>            <property name="PHOTO" filter="no">
>              <value field="PHOTO" conversion="BLOB_B64"/>
>              <parameter name="TYPE" default="no" show="yes">
>                <value field="PHOTO_TYPE"/>
>              </parameter>
>              <parameter name="VALUE" default="no" show="yes">
>                <value field="PHOTO_VALUE"/>
>              </parameter>
>            </property>
> 
>    Script:
>          if (PHOTO_VALUE == "uri" &&
>              SUBSTR(PHOTO, 0, 7) == "file://") {
>              // inline the photo data
>              string data;
>              data = READ(SUBSTR(PHOTO, 7));
>              if (data != UNASSIGNED) {
>                  PHOTO = data;
>                  PHOTO_VALUE = "binary";
>              }
>          }
> 
>    Test cases for inlining, not inlining because of non-file URI and
>    failed inlining (file not found) were added to SyncEvolution.
> 
> diff --git a/src/sysync/scriptcontext.cpp b/src/sysync/scriptcontext.cpp
> index f21641e..e6124c9 100755
> --- a/src/sysync/scriptcontext.cpp
> +++ b/src/sysync/scriptcontext.cpp
> @@ -27,6 +27,7 @@
>   #include "pcre.h" // for RegEx functions
> #endif
> 
> +#include <stdio.h>
> 
> // script debug messages
> #ifdef SYDEBUG
> @@ -869,6 +870,55 @@ public:
>     aTermP->setAsInteger(exitcode);
>   }; // func_Shellexecute
> 
> +  // string READ(string file)
> +  // reads the file and returns its content or UNASSIGNED in case of failure;
> +  // errors are logged
> +  static void func_Read(TItemField *&aTermP, TScriptContext *aFuncContextP)
> +  {
> +    // get params
> +    string file;
> +    aFuncContextP->getLocalVar(0)->getAsString(file);
> +
> +    // execute now
> +    string content;
> +    FILE *in;
> +    in = fopen(file.c_str(), "rb");
> +    if (in) {
> +      long size = fseek(in, 0, SEEK_END) == 0 ? ftell(in) : -1;
> +      if (size >= 0) {
> +        // managed to obtain size, use it to pre-allocate result
> +        content.reserve(size);
> +        fseek(in, 0, SEEK_SET);
> +      } else {
> +        // ignore seek error, might not be a plain file
> +        clearerr(in);
> +      }
> +
> +      if (!ferror(in)) {
> +        char buf[8192];
> +        size_t read;
> +        while ((read = fread(buf, 1, sizeof(buf), in)) > 0) {
> +          content.append(buf, read);
> +        }
> +      }
> +    }
> +
> +    if (in && !ferror(in)) {
> +      // return content as string
> +      aTermP->setAsString(content);
> +    } else {
> +        PLOGDEBUGPRINTFX(aFuncContextP->getDbgLogger(),
> +                       DBG_ERROR,(
> +                                  "IO error in READ(\"%s\"): %s ",
> +                                  file.c_str(),
> +                                  strerror(errno)));
> +    }
> +
> +    if (in) {
> +      fclose(in);
> +    }
> +  } // func_Read
> +
> 
>   // string REMOTERULENAME()
>   // returns name of the LAST matched remote rule (or subrule), empty if none
> @@ -2220,6 +2270,7 @@ const TBuiltInFuncDef BuiltInFuncDefs[] = {
>   { "REQUESTMAXTIME", TBuiltinStdFuncs::func_RequestMaxTime, fty_none, 1, param_oneInteger },
>   { "REQUESTMINTIME", TBuiltinStdFuncs::func_RequestMinTime, fty_none, 1, param_oneInteger },
>   { "SHELLEXECUTE", TBuiltinStdFuncs::func_Shellexecute, fty_integer, 3, param_Shellexecute },
> +  { "READ",  TBuiltinStdFuncs::func_Read, fty_string, 1, param_oneString },
>   { "SESSIONVAR", TBuiltinStdFuncs::func_SessionVar, fty_none, 1, param_oneString },
>   { "SETSESSIONVAR", TBuiltinStdFuncs::func_SetSessionVar, fty_none, 2, param_SetSessionVar },
>   { "ABORTSESSION", TBuiltinStdFuncs::func_AbortSession, fty_none, 1, param_oneInteger },
> 
> -----------------------
> 
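
As a side note on the "blob" fields change quoted above: the decision it
describes (send the value unchanged if every character is printable ASCII,
otherwise fall back to base64) can be sketched as a standalone check. This is
only an illustration; isPrintableAscii is a name invented here, and the range
0x20..0x7E is what isprint() accepts in the default "C" locale.

```cpp
#include <string>

// Hypothetical helper (name invented for this example): returns true if
// every byte of 'value' is printable 7-bit ASCII (0x20..0x7E), meaning
// the value could be sent as-is instead of being base64 encoded.
bool isPrintableAscii(const std::string &value) {
  for (unsigned char c : value) {
    if (c < 0x20 || c > 0x7E) return false;
  }
  return true;
}
```

A file:// URI passes this check and stays readable in the vCard, while JPEG
data fails on its very first byte (0xFF) and is base64 encoded as before.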