Discussion:
[SMARTY] Smarty and Unicode
Marcus Bointon
2006-02-28 17:18:17 UTC
How does smarty deal with unicode, specifically UTF-8 templates and
passed string vars? Does it "just work", or are there steps I need to
take to make it happen?

Also does anyone have any links to articles on converting projects
from 8-bit encodings to unicode? I've not found anything that's much
good so far.

Marcus
--
Marcus Bointon
Synchromedia Limited: Putting you in the picture
***@synchromedia.co.uk | http://www.synchromedia.co.uk
--
Smarty General Mailing List (http://smarty.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
Monte Ohrt
2006-02-28 17:20:54 UTC
I believe your PHP and HTML will drive how well UTF-8 works. Smarty
doesn't care what encoding is used. If your template is UTF-8, then be
sure your assigned template vars are UTF-8 too.
Marcus Bointon
2006-02-28 17:49:40 UTC
My experience is that it just works, no need to do anything -I made
a BIG, BIG project on it- the main issue with unicode isn't
smarty, it's PHP itself.
PHP has a terrible problem with it, since it's not supported
natively and the mb extension just sucks -to be usable it
should be a full replacement for all string-related functions,
which it is not.
Yup, I've been keeping tabs on the progress of PHP 6, and I watched a
database system do the big switch a couple of years ago. Given the
lack of unicode safety in PHP, I would expect to run into problems
with things like modifier plugins - all it takes is a single
preg_replace or str_replace and it's going to break, so I was
wondering how you go about ensuring that doesn't happen.

Marcus
Marcus Bointon
2006-02-28 18:07:32 UTC
Regex don't even work, so if you're goin' to use regular
expressions, you're out of luck for non latin languages
So given that it apparently does work, is it therefore true that the
Smarty compiler and all standard plugins are regex-free?

Marcus
messju mohr
2006-02-28 19:47:20 UTC
Post by Marcus Bointon
Regex don't even work, so if you're goin' to use regular
expressions, you're out of luck for non latin languages
So given that it apparently does work, is it therefore true that the
Smarty compiler and all standard plugins are regex-free?
no. first of all: all plugins that care about encoding (like
capitalize, lower or truncate) may not (or very likely won't) work.

regarding regexes: i don't see why an ascii regex shouldn't
match on utf-8 input. if you stick to ascii with your variable
names and identifiers, then utf-8 templates should be transparent to
smarty.
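
A minimal sketch of why an ASCII-only pattern can be safe on UTF-8 input. Every byte of a multi-byte UTF-8 character is 0x80 or higher, so ASCII delimiters such as `{`, `$` and `}` can never appear inside one. The `TAG` pattern and `render` helper below are hypothetical stand-ins written in Python for illustration; they are not Smarty's actual compiler regexes.

```python
import re

# Hypothetical tag pattern (an assumption, not Smarty's real regex):
# matches {$name} where name is a plain ASCII identifier.
TAG = re.compile(rb"\{\$([A-Za-z_][A-Za-z0-9_]*)\}")

def render(template: bytes, values: dict) -> bytes:
    """Substitute {$name} tags byte-wise, leaving all other bytes untouched."""
    def substitute(match):
        name = match.group(1).decode("ascii")
        return values[name]  # values are already UTF-8 encoded bytes
    return TAG.sub(substitute, template)

# Non-ASCII text outside the delimiters passes through unchanged.
template = "Grüße, {$name}! Καλημέρα.".encode("utf-8")
output = render(template, {"name": "Sánchez".encode("utf-8")})
print(output.decode("utf-8"))
```

The pattern never needs to understand the text around the tag, which is the sense in which UTF-8 templates can be "transparent" as long as the assigned values use the same encoding.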
Vicente Werner
2006-03-01 08:12:44 UTC
---------- Forwarded message ----------
From: Vicente Werner <***@gmail.com>
Date: Mar 1, 2006 9:06 AM
Subject: Re: [SMARTY] Smarty and Unicode
To: messju mohr <***@lammfellpuschen.de>

Regexes in PHP 5.x or lower don't work correctly for UTF-8 strings: they
match ASCII chars correctly, but not UTF-8 ones (despite claiming support
for it). That won't be ready and fully usable until PHP 6.
--
Vicente Werner y Sánchez


messju mohr
2006-03-01 08:28:21 UTC
Post by Vicente Werner
Regexes in PHP 5.x or lower don't work correctly for UTF-8 strings: they
match ASCII chars correctly, but not UTF-8 ones (despite claiming support
for it). That won't be ready and fully usable until PHP 6.
correct. but the smarty syntax is ascii and not utf-8. the regexes
correctly match the stuff inside {} and correctly ignore the stuff
outside {}.
Vicente Werner
2006-03-01 08:53:29 UTC
Yes, but what I was referring to in my email to Marcus was that if the
variable contents you need to modify were UTF-8 strings you might get very
strange results since php regex will treat that utf-8 string like an ASCII
one.
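
A rough sketch of the kind of "strange result" meant here, using Python for illustration rather than PHP. A byte-oriented pattern sees each byte of a multi-byte character as a separate unit, while a character-aware pattern sees one code point. The sample string is an assumption chosen for the demo.

```python
import re

text = "añb"
raw = text.encode("utf-8")  # b'a\xc3\xb1b': 4 bytes for 3 characters

# Byte-oriented matching: "." matches one byte, so "ñ" is split in two.
byte_matches = re.findall(rb".", raw)

# Character-aware matching: "." matches one code point.
char_matches = re.findall(r".", text)

print(len(byte_matches), len(char_matches))  # 4 3

# A byte-wise replacement can leave invalid UTF-8 behind.
mangled = re.sub(rb"[^a-z]", b"?", raw)
print(mangled)  # b'a??b'
```

PHP's PCRE functions do accept a `u` pattern modifier that switches a pattern to UTF-8 mode; whether it covered the cases described in this post is not stated in the thread.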

That issue has made me turn my back on PHP until version 6, where it'll be
solved -I hope, unless zend gets a new stupid idea again and decides to put
the effort in adding better support for .net, or anythin' like that-
--
Vicente Werner y Sánchez
Marcus Bointon
2006-03-01 15:43:16 UTC
Post by Vicente Werner
Yes, but what I was referring to in my email to Marcus was that if the
variable contents you need to modify were UTF-8 strings you might get very
strange results since php regex will treat that utf-8 string like an ASCII
one.
I only really see a potential problem with modifiers. As Boots said,
as long as Smarty treats stuff inside and outside the delimiters
consistently, Unicode's ASCII compatibility should just let it work
transparently.

I've done some experiments and it does indeed 'just work' - getting
stuff in and out of the DB is similarly transparent (and yes I am
using mysql_real_escape_string!). It's safe to have UTF-8 templates,
though it seems sensible to keep scripts as iso-8859-1 (though it
also worked as UTF-8, including literal unicode text in strings).

I've no particular need to use UTF-8 variable or file names, so I'm
happy to simply avoid that issue.

It seems I 'only' need to pay attention to modifiers and validation
functions (strlen, trim and friends). It might be a good opportunity
to make my validation more modular so that when PHP 6 comes along I
can swap it all out easily.
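
One way the "modular validation" idea could look, sketched in Python for illustration. The length function is passed in, so a byte-oriented implementation can later be swapped for a character-aware one without touching the validators. All names here (`byte_length`, `char_length`, `make_max_length_validator`) are hypothetical; in PHP the corresponding pair would be `strlen` versus `mb_strlen` from the mbstring extension.

```python
def byte_length(value: str) -> int:
    """Length as a byte-oriented function would see it (UTF-8 bytes)."""
    return len(value.encode("utf-8"))

def char_length(value: str) -> int:
    """Length in Unicode code points."""
    return len(value)

def make_max_length_validator(max_len: int, length=char_length):
    """Build a validator; the length function is injected so it can be swapped."""
    def validate(value: str) -> bool:
        return length(value) <= max_len
    return validate

name = "Sánchez"  # 7 characters, 8 bytes in UTF-8
by_chars = make_max_length_validator(7)
by_bytes = make_max_length_validator(7, length=byte_length)
print(by_chars(name), by_bytes(name))  # True False

# Cutting at a byte offset can split a multi-byte character in half.
truncated = name.encode("utf-8")[:2]  # b'S\xc3': first byte of "á" only
print(truncated.decode("utf-8", errors="replace"))
```

The same input passes or fails depending on which length function is plugged in, which is why length checks, trimming and truncation are the places to audit first.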
Post by Vicente Werner
That issue has made me turn my back on PHP until version 6, where it'll be
solved -I hope, unless zend gets a new stupid idea again and decides to put
the effort in adding better support for .net, or anythin' like that-
While I agree that PHP should probably have sorted this one out
earlier, I have a fairly large codebase and can't afford to switch
environments now. Plus there's too much other good stuff in PHP ;^)

Thanks for all your help,

Marcus
Vicente Werner
2006-03-01 16:22:37 UTC
Post by Marcus Bointon
I've done some experiments and it does indeed 'just work' - getting
stuff in and out of the DB is similarly transparent (and yes I am
using mysql_real_escape_string!). It's safe to have UTF-8 templates,
though it seems sensible to keep scripts as iso-8859-1 (though it
also worked as UTF-8, including literal unicode text in strings).
Well, it's not that transparent: the only way to guarantee your stuff won't
get corrupted going in and out of the db is to transcode into entities on
the way in and out. My main concern was the need to have everything utf-8
encoded, so my scenario might be very different from yours. For example,
dinahosting runs perfectly in Arabic and Hebrew -translations were made
although they're not online yet-.


Post by Marcus Bointon
It seems I 'only' need to pay attention to modifiers and validation
functions (strlen, trim and friends). It might be a good opportunity
to make my validation more modular so that when PHP 6 comes along I
can swap it all out easily.
I made my own validation system, inspired by SmartyValidate but with a
different system. It's fully transparent to the programmer.


Post by Marcus Bointon
While I agree that PHP should probably have sorted this one out
earlier, I have a fairly large codebase and can't afford to switch
environments now. Plus there's too much other good stuff in PHP ;^)
Yes, but the benefits of PHP for this particular scenario don't outweigh
its shortcomings.

--
Vicente Werner y Sánchez

boots
2006-02-28 20:44:08 UTC
Post by Marcus Bointon
Regex don't even work, so if you're goin' to use regular
expressions, you're out of luck for non latin languages
So given that it apparently does work, is it therefore true that the
Smarty compiler and all standard plugins are regex-free?
I'm not following this discussion but that statement is fallacious -- Smarty is
very much built on regex parsing. perhaps the problem is defining "it works".
:)

xo boots
