On Sat, 19 Oct 2013 10:56:02 +0100, Kagamin <s...@here.lot> wrote:

> On Friday, 18 October 2013 at 10:44:11 UTC, Regan Heath wrote:
>> This comes up time and again. The use of, and ability to distinguish empty from null is very useful. Yes, you run the risk of things like null pointer exceptions etc, but we have that risk now without the reward of being able to distinguish these cases.

> In C# code null strings are a plague.

I code in C# every day for work and I never have any problems with null strings. The conflated empty/null cases are the real nightmare for me (more below).

Null strings are no different to null class references; they're not a special case. People seem to have this odd idea that null is somehow an invalid state for a string /reference/ (C# strings are reference types). It's not.

People also seem to elevate empty strings to some sort of special status. That's like saying 0 has some special status for int. It doesn't; it's just one of a number of possible values.

In fact, int having no null-like state is a "problem", which leads to solutions like boxing that elevate the value type to a reference in order to allow a null state for int.
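
A minimal sketch of that boxing workaround, written in Java because its int/Integer split shows the same idea (C# would use a nullable int instead). The class and method names are hypothetical, chosen only for illustration.

```java
class BoxedIndex {
    // Returns the position of target, or null when it is absent.
    // 0 is an ordinary, valid result; only null means "not found".
    static Integer indexOf(int[] values, int target) {
        for (int i = 0; i < values.length; i++) {
            if (values[i] == target) {
                return i; // autoboxed from int to Integer
            }
        }
        return null; // a plain int could not express this state
    }

    public static void main(String[] args) {
        int[] values = { 5, 7 };
        System.out.println(indexOf(values, 5)); // found at position 0
        System.out.println(indexOf(values, 9)); // absent, so null
    }
}
```

With a plain int return type the same function would need a sentinel such as -1, which is exactly the kind of coding around the lack of null described above.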

Yet in D we've decided to inconsistently remove that functionality from string for no gain. If string could not actually be null then we'd gain something from the limitation; instead we lose functionality and gain nothing. You still have to check your strings for null in D.

We ought to go one way or the other; this middle ground is worse than either of the other options.

In my code I don't have to check for or treat empty strings any differently to other values. I simply have to check for null. Remembering to check for null on reference types is automatic for me, strings are not special in this regard.

> Most of the time you don't need them

Sure, and if I don't have access to null (like when using a value type like int), I can code around that lack, but it's never as straightforward a solution.

> but still must check for them just in order to not get an exception.

Sure, you must check for the possible states of a reference type.

> Also business logic makes no difference between null and empty

This is simply not true.  Example at the end.

> both of them are just "no data", so you end up typing if(string.IsNullOrEmpty(mystr)) every time everywhere.

I only have to code like this when I use 3rd party code which has conflated empty and null. In my code when it's null it means not specified, and empty is just one type of value - for which I do no special handling.

> And, yeah, only one small feature in this big mess ever needs to differentiate between null and empty.

Untrue, null allows many alternate and IMO more direct/obvious designs.

> I found this one case trivially implementable, but nulls still plague all remaining code.

Which one case?  The readline() one below?

>> Take this simple design:
>>
>>   string readline();
>>
>> This function would like to be able to:
>>  - return null for EOF
>>  - return [] for a blank line
>>
>> but it cannot, because as soon as you write:
>>
>>   foo(readline())
>>
>> the null/[] case merges.

> This is a horrible design. You better throw an exception on eof instead of null:

No, no, no. You should only throw in exceptional circumstances or you risk using exceptions for flow control, and that is just plain horrid.
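
As a rough illustration of the null-for-EOF contract, here is a sketch in Java, whose BufferedReader.readLine returns null at end of input and "" for a blank line. The LineCounter class and its countLines helper are hypothetical names; the only exception handling is for genuine I/O failure, not for reaching the end of the data.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

class LineCounter {
    // Returns {blankLines, nonBlankLines} for the given text.
    static int[] countLines(String text) {
        int blank = 0;
        int nonBlank = 0;
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            // null means end of input; "" is just another line value.
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    blank++;
                } else {
                    nonBlank++;
                }
            }
        } catch (IOException e) {
            // A real I/O failure is exceptional; reaching EOF is not.
            throw new UncheckedIOException(e);
        }
        return new int[] { blank, nonBlank };
    }

    public static void main(String[] args) {
        int[] counts = countLines("a\n\nb\n");
        System.out.println(counts[0] + " blank, " + counts[1] + " non-blank");
    }
}
```

The loop ends through an ordinary comparison, so end of input never enters the exception path, and the blank line in the middle is counted rather than mistaken for EOF.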

> this null will break the caller anyway possibly in a contrived way.

Never a contrived way, always a blatantly obvious one and only if you're not doing your job properly. If you want a contrived, unpredictable and difficult to debug breakage look no further than heap or stack corruption. Null is never a difficult bug to find and fix, and is no different to forgetting to handle one of the integer return values of a function.

>> I use this all the time:
>> http://msdn.microsoft.com/en-us/library/system.io.streamreader.readline.aspx
>>
>> It has never caused me any issues. It explicitly states that null is a possible output, and so I check for it - doing anything less is simply bad programming.

> It works if you read one line per loop cycle, but if you read several lines and assume they're not null (some multiline data format),

There is your problem, never "assume" - the documentation is very clear on the issue.

> you're screwed or your code becomes littered with null checks, but who accounts for all alternative scenarios from the start?

Me, and IMO any competent programmer. It is misguided to think you can ignore valid states; null is a valid state in C, C++, C#, and D. You should be thinking about and handling it.

You don't have to check for it on every access to the variable, but you do need to check for it once where the variable is assigned, or passed (in private functions you can skip this). From that point onward you can assume non-null, valid, job done.
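
The check-once pattern might look something like this in Java. It is only a sketch under assumed names (Greeter, greet and decorate are hypothetical); Objects.requireNonNull is the standard library call used for the boundary check.

```java
import java.util.Objects;

class Greeter {
    // Entry point: the one place where the reference is checked.
    static String greet(String name) {
        Objects.requireNonNull(name, "name must not be null");
        return decorate(name);
    }

    // Private helper: relies on the check above, so it has no null test.
    // An empty name is handled like any other value.
    private static String decorate(String name) {
        return "Hello, " + name.trim() + "!";
    }

    public static void main(String[] args) {
        System.out.println(greet(" Bob "));
        System.out.println(greet(""));
    }
}
```

A null argument fails immediately at the boundary with a clear message, and nothing past that point needs to repeat the check.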

>> There are plenty of other such design/cases that can be imagined, and while you can work around them all they add complexity for zero gain.

> I believe there's no problem domain, which would like to differentiate between null and empty string instead of treating them as "no data".

null means not specified, non existent, was not there.
empty means, present but set to empty/blank.

Databases have this distinction for a reason.

If you get input from a user a field called "foo" may be:
 - not specified
 - specified

and if specified, may be:
 - empty
 - not empty

If foo is not specified you may want to assign a default value for it. If your business logic uses empty to mean "not specified", you prevent the user from actually setting foo to empty, and that limitation is a right pain in many cases.
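
A small sketch of the "foo" case in Java, assuming the user input arrives as a map of field names to values. FieldDefaults, resolveFoo and the default value are hypothetical and only illustrate the distinction.

```java
import java.util.HashMap;
import java.util.Map;

class FieldDefaults {
    // Hypothetical default, applied only when the field was not supplied.
    static final String DEFAULT_FOO = "default";

    static String resolveFoo(Map<String, String> input) {
        String foo = input.get("foo"); // null when the key is absent
        if (foo == null) {
            return DEFAULT_FOO; // not specified: fall back to the default
        }
        return foo; // specified: keep it, even if it is ""
    }

    public static void main(String[] args) {
        Map<String, String> input = new HashMap<>();
        System.out.println("[" + resolveFoo(input) + "]"); // not specified
        input.put("foo", "");
        System.out.println("[" + resolveFoo(input) + "]"); // deliberately empty
    }
}
```

If the check were foo == null || foo.isEmpty(), the second call would also return the default and the user could never set foo to empty.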

You can code around this by using a boolean or a dictionary to indicate the specified/not specified distinction, but this is less direct than simply using null.

If we have null, let's use it; if we want to remove null then let's remove it. But can we get out of this horrid middle ground, please?

Regan
