20: eina_tmpstr: add eina_tmpstr_strftime

Tom Hacohen Thu, 01 Oct 2015 07:07:52 -0700

Freeze is coming near. Should I revert the previous patches, or is it 
being taken care of?


--
Tom.


On 29/09/15 07:04, SHILPA ONKAR SINGH wrote:
>
>
> ------- Original Message -------
> Sender : Amitesh Singh<[email protected]>
> Date : Sep 29, 2015 02:03 (GMT+09:00)
> Title : Re: [E-devel] [EGIT] [core/efl] master 17/20: eina_tmpstr: add 
> eina_tmpstr_strftime
>
> Hi,
>
>
> On Mon, Sep 28, 2015 at 7:35 PM, Tom Hacohen wrote:
>
>> On 28/09/15 09:53, Tom Hacohen wrote:
>>> On 24/09/15 18:53, Cedric BAIL wrote:
>>>> On Thu, Sep 24, 2015 at 10:20 AM, Tom Hacohen
>> wrote:
>>>>> On 23/09/15 19:56, Cedric BAIL wrote:
>>>>>> On Wed, Sep 23, 2015 at 1:18 AM, Tom Hacohen
>> wrote:
>>>>>>> On 22/09/15 18:31, Cedric BAIL wrote:
>>>>>>>> Le 22 sept. 2015 09:40, "Tom Hacohen" a
>> écrit :
>>>>>>>>>
>>>>>>>>> On 22/09/15 17:32, Cedric BAIL wrote:
>>>>>>>>>> Le 22 sept. 2015 02:30, "Daniel Kolesa" a
>> écrit :
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Sep 21, 2015 at 11:24 PM, Shilpa Singh <
>>>>>>>> [email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>>>> cedric pushed a commit to branch master.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>> http://git.enlightenment.org/core/efl.git/commit/?id=abaf29cb768375957c9ee0b64d36034c21c618ea
>>>>>>>>>>>>
>>>>>>>>>>>> commit abaf29cb768375957c9ee0b64d36034c21c618ea
>>>>>>>>>>>> Author: Shilpa Singh
>>>>>>>>>>>> Date:   Mon Sep 21 23:48:16 2015 +0200
>>>>>>>>>>>>
>>>>>>>>>>>>           eina_tmpstr: add eina_tmpstr_strftime
>>>>>>>>>>>
>>>>>>>>>>> This API seems awfully arbitrary and I don't really see the
>> point.
>>>>>>>>>>> Might as well be any other function that prints to strings - are
>> you
>>>>>>>>>>> gonna add these too? Sounds horrible to me.
>>>>>>>>>>
>>>>>>>>>> Is it better to have our code cluttered by static buffer and no
>> overflow
>>>>>>>>>> check ? Obviously not. Yes,I expect we refactor all our buffer
>> print
>>>>>>>> code
>>>>>>>>>> as it is pretty bad right now and may even lead to security issue
>> in
>>>>>>>> some
>>>>>>>>>> case.
>>>>>>>>>
>>>>>>>>> Just to chime in, I think this doomsday scenario is a bit
>> exaggerated.
>>>>>>>>> I think having code like:
>>>>>>>>>
>>>>>>>>> char buf[SOME_LEN];
>>>>>>>>> strftime(...buf...);
>>>>>>>>> return tmpstr_add(buf);
>>>>>>>>>
>>>>>>>>> Is a much cleaner solution than adding all of the string functions
>> of
>>>>>>>>> the world to tmpstr, strbuf, stringshare and etc.
>>>>>>>>>
>>>>>>>>> I really don't like code duplication, and I think that's where
>> Daniel is
>>>>>>>>> coming from.
>>>>>>>>>
>>>>>>>>> What do you think?
>>>>>>>>
>>>>>>>> It's a good example that won't always work and increase at best the
>> stack
>>>>>>>> size requirement for absolutely no good reason.
>>>>>>>
>>>>>>> Why wouldn't it always work?
>>>>>>> Also if you enclose it in its own sub {}, every compiler will treat
>> the
>>>>>>> stack sanely, and even without it, I'd assume optimising compilers
>> will.
>>>>>>
>>>>>> The previous implementation used in Elementary was not working with
>>>>>> some local where it used way more charactere than expected. Work
>>>>>> around solution is to be over generous on memory...
>>>>>>
>>>>>> But compiler have no way to optimize it at all. How could they know
>>>>>> before jumping to a function call, that is going to generate data at
>>>>>> runtime, how much that function is going to use to limit the size of
>>>>>> the buffer on the stack they give to it. No compiler can optimize
>>>>>> that. It's just impossible. Every time we put a 4K buffer on the stack
>>>>>> all further function call will force that page allocation, that's how
>>>>>> things are.
>>>>>
>>>>> As implied by my comment about the own sub {}, I was talking about
>>>>> optimising the over-stack-usage.
>>>>>
>>>>> char buf[SOMELEN];
>>>>> strftime(buf);
>>>>> ret = tmpstr_add(buf);
>>>>> // buf is not used anymore, stack can be cleared.
>>>>>
>>>>> That's what I meant. It's true though that compilers can't assume we
>>>>> haven't kept the buf pointer and used it somewhere else (which we are
>>>>> allowed to until the function ended). Though that's not what you were
>>>>> talking about.
>>>>
>>>> That is another case that a compiler can not optimize. It has no clue
>>>> that buf is not stored somewhere globally, or that ret doesn't point
>>>> to somewhere in it. That's why stack size vary depending on the block
>>>> you are in. Also stack never shrink, so once you get those 4K on it,
>>>> they will stay there.
>>>
>>> As I said, this case is unlikely, but the other case I mentioned does
>>> allow the compiler to optimise:
>>>
>>> int foo(void)
>>> {
>>>       const char *bla;
>>>       {
>>>          char buf[LONGBUF];
>>>          strftime(...);
>>>          bla = eina_tmpstr_add;
>>>       }
>>>       // rest of code
>>>       printf("%s\n", bla);
>>>       return 0;
>>> }
>>>
>>> this can even be simplified into a macro that accepts a parameter saying
>>> which function to use to duplicate the string. strdup, eina_tmpstr_add,
>>> eina_stringshare_add or whatever.
>>>
>>>>
>>>>> However, looking at the tmpstr code, the solution there is no better.
>> It
>>>>> reallocs all the time and implements a whole mechanism there. Wouldn't
>>>>> it be much better to create a eina_strftime_alloc (bad name) that
>>>>> returns an allocated string of the right size and that can be used in
>>>>> strbuf (strbuf_string_steal), tmpstr (add a string steal) and etc?
>>>>> That's writing the function once, and having one API for all of eina,
>>>>> instead of the proposed method which will add plenty and duplicate all
>>>>> around.
>>>>
>>>> strftime doesn't tell you the size of the string you need. It only
>>>> tells you if it failed to put the generated string inside the buffer
>>>> you gave it. There is no way around doing a realloc loop with that
>>>> API.
>>>>
>>>> As for writing the function once, it is the case right now. If I was
>>>> to add that function to strbuf or stringshare, then I would obviously
>>>> find a way to share the code. If not by just calling the tmpstr
>>>> function. In general for Eina, we do not add a function until we have
>>>> at least 2 users in our code base. So for now, I do not plan to add
>>>> strftime to any other part of Eina, as we do not have the need for it.
>>>
>>> You are adding eina_tmpstr_add() without any users in the code base...
>>> Anyhow, I think it would be better to add API in a common place, as we
>>> know it'll be used more, than to add a specific API and then duplicating
>>> it to a common place.
>>>
>>>>
>>>>>>>> I see no reason to have bad code like that just for fewer line of
>> code.
>>>>>>>
>>>>>>> It's not a fewer lines of code, it's fewer API and a lot less lines
>> of
>>>>>>> code. Also, adding such API implies to developers that they should
>>>>>>> expect such API, so you are just feeding the evil loop.
>>>>>>
>>>>>> And they should use it, as the way they usually manually do it lead to
>>>>>> bug. This is actually reducing our code base and also reduce our
>>>>>> number of actual bugs.
>>>>>
>>>>> Have a general purpose function that does that (as above), no manual
>>>>> work. Still reduces the codebase, much cleaner.
>>>>
>>>> eina_strftime is I guess what you are trying to say here. Basically
>>>> not making it a tmpstr and just use free to get rid of it. I guess
>>>> that's an acceptable solution.
>>>
>>> Yes, it's exactly what I'm saying. Let's revert this patch and work on
>>> an eina_strftime instead.
>>>
>>>>
>>>>>>> Btw, I was thinking about the whole tmpstr solution (which you know
>> I am
>>>>>>> not a fan off) last night, and I came up with two solutions I think
>> are
>>>>>>> better, and anyhow, are massively more efficient.
>>>>>>> The first involves creating new API (but it's still less API than
>>>>>>> tmpstr) for genlist and etc that accept a struct instead of a string.
>>>>>>> This way we can add more info that will hint who should free it.
>>>>>>> This is the more involved, clean solution, that we should maybe adopt
>>>>>>> for EFL 2.0.
>>>>>>
>>>>>> So your solution is to pass a struct where there will be one byte
>>>>>> dedicated to say if that string need to be freed ? I think that's far
>>>>>> from elegant and fully error prone with no way for the compiler to
>>>>>> help on that.
>>>>>
>>>>> Maybe we'll have more metadata in the future. The struct will be
>> created
>>>>> automatically with an helper function, so not error prone at all.
>>>>
>>>> I will have to see it, but I have yet to see what would all those
>>>> metadata be. For the moment, I don't see how this could be an
>>>> improvement over current code base.
>>>
>>> I don't think eina_tmpstr is a good solution that will scale nicely,
>>> these are just alternative ideas.
>>>
>>>>
>>>>>>> The easy hacky solution, which is a replacement to tmpstr (which is
>> also
>>>>>>> a hack) and is much simpler and more efficient, is to use this
>> scheme:
>>>>>>> For static strings you just return them:
>>>>>>> return "foo";
>>>>>>> for dynamic you return this:
>>>>>>> return eina_tomstr_add("bar");
>>>>>>>
>>>>>>> You can free both with eina_tomstr_free(buf);
>>>>>>>
>>>>>>> The implementation is where the magic is at, in memory, it looks
>> like this:
>>>>>>> "\x02bar", the first char being the STX (start of text) char. We just
>>>>>>> use eina_tomstr_str_get(buf) whenever we want to use it. It just
>> returns
>>>>>>> "buf + 1" if the first char is STX. It's infinitely faster and more
>>>>>>> memory efficient than tmpstr, I think also much cleaner. The only bad
>>>>>>> thing is that you need to use the str function, but I think that's a
>>>>>>> reasonable compromise.
>>>>>>
>>>>>> That would obviously lead to random bug. If you free a buf that has
>>>>>> not been allocated by your tomstr, it will lead to an off by one
>>>>>> access, if you are unlucky you will fall on a non allocated page. If
>>>>>> you are really unlucky it will fall on an existing data that does have
>>>>>> the expected value and call free on a static buffer. Also it won't be
>>>>>> that much faster or slower as the current implementation walk a short
>>>>>> list that should definitively be only a few elements long.
>>>>>
>>>>> How so? As I said, genlist gets "\x02bar", so genlist can safely check
>>>>> the *first* (not -1) character of the string. It can then pass the
>>>>> result of string_get() to where it needs actual text, but it keeps the
>>>>> real reference for future processing. I don't see why or how we'll have
>>>>> text that starts with "\0x02". If we want to make it even more rare, we
>>>>> can use two bytes. Still much more efficient than tmpstr. Why would the
>>>>> the current implementation be a short list? You want this to be used
>> all
>>>>> around with dynamic strings, this list will grow as usage will grow.
>>>>
>>>> Oh, so you never let the API user know that it is indeed a string and
>>>> allways make it a hidden structure. That would make the code base
>>>> quite hugly with all those str_get arround for little benefit. Also
>>>> having the risk to have a confused API with random behavior sounds bad
>>>> to me. If you want to hide information in a less problematic way, you
>>>> can hide easily one bit in the pointer itself. As your API wouldn't
>>>> autorize the use of the string directly anyway, having the least
>>>> significant bit used to say if the pointer is to be freed or not will
>>>> be clearly less error prone. Still it's hugly as hell and I think
>>>> should be avoided in Efl code base (That's not the first time we have
>>>> ruled against using that trick in Eina code base, and I am still
>>>> against this kind of trick).
>>>
>>> I think the proposed mechanism is much safer than encoding in the
>>> pointer. Encoding in the pointer will result in bad memory access and
>>> thus a crash. My suggestion can cause some issues with comparison,
>>> that's it. You can create mechanisms that make it completely error free.
>>> For example, you could save it internally (when you do) as a type that
>>> can't be cast to a string, and then you'd have to use the functions
>>> otherwise it wouldn't compile. This is just a clean way to maintain the
>>> string ABI but still making it safe internally.
>>>
>>>>
>>>> As for why it should stay a short list it is due to what tmpstr is
>>>> used for. Code pattern for it is of the form :
>>>>
>>>> tmp = function_returning_tmpstr();
>>>> function_call(tmp);
>>>> tmpstr_del(tmp);
>>>>
>>>> So either we have a massive recursion (which is wrong and need to be
>>>> fixed), or we are leaking tmpstr. tmpstr is not intended to be used
>>>> for outside of a block and should not be in a structure for example.
>>>> As I said, tmpstr alive at any point in time should be pretty low and
>>>> walking that said list never be a performance issue.
>>>>
>>>
>>> True. If we only use it in genlist (and similar), this is the usage
>>> pattern and it won't be the end of the world. However, as I said, I
>>> proposed above a solution I find more clean and more general purpose.
>>> Can be used in many places and saved for the long term. This is a bit
>>> diverging from the purpose of this thread though (my bad). It's more
>>> important that we deal with the issue at hand before we freeze.
>>>
>>>
>>>
>>>
>>> Conclusion: let's add eina_strftime and eina_tmpstr_manage_new and
>>> revert this before the freeze.
>>
>> Btw, the rewrite should also fix the warnings.
>>
>> src/lib/elm_calendar.c:175:11: warning: return discards ‘const’
>> qualifier from pointer target type [-Wdiscarded-qualifiers]
>>       return eina_tmpstr_strftime(E_("%B %Y"), selected_time);
>>              ^
>>
>
> There was a quick patch to fix the warnings. But its incorrect. :/
> https://phab.enlightenment.org/D3113
>
> @tasn, @cedric
> I will add eina_strftime and eina_tmpstr_manage_new as discussed and 
> accordingly revert my patch
> https://phab.enlightenment.org/D3048
> will update: https://phab.enlightenment.org/D3087(example)
> and then update elm_calendar code accordingly.
> The above warning patch, I did not have a choice but to typecast due to 
> signature of format_func, but hopefully with above changes,
> I wont have to typecast I will update with corrected code.
> @tasn, sorry to use same email client, will change soon
>
>
>> --
>> Tom.
>>
>>
>>
>> ------------------------------------------------------------------------------
>> _______________________________________________
>> enlightenment-devel mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/enlightenment-devel
>>
> ------------------------------------------------------------------------------
> _______________________________________________
> enlightenment-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/enlightenment-devel
> ------------------------------------------------------------------------------
> _______________________________________________
> enlightenment-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/enlightenment-devel
>


------------------------------------------------------------------------------
_______________________________________________
enlightenment-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/enlightenment-devel

Re: [E-devel] [EGIT] [core/efl] master 17/20: eina_tmpstr: add eina_tmpstr_strftime

Reply via email to