std::vector is very slow?
By Nico on Tuesday, July 26 2005, 18:02 - pytst
I was trying to understand why my latest version of pytst (0.92) was 10% slower than the previous one (0.86). I made three important changes between those two versions:
- I used my PythonReference class to ease the Python reference count management. I was fed up with having to manage Py_INCREF and Py_DECREF manually, so I decided to let the C++ compiler do the job itself, following the advice of Scott Meyers.
- I made a complete overhaul of the memory_storage class, which is the class that, well, manages the storage of the TST nodes in memory (my long term plan is to provide a file_storage class to be able to store and use the tree directly on the filesystem). I was feeling a bit silly to have implemented my own kind of std::vector implementation, so I wisely decided to use std::vector instead. After all, why reinvent the wheel?
- The serialization system is now completely independent of the storage implementation. It's like a generic dump of the tree. This way, it will be possible to serialize a tst instance backed by a memory_storage and load it into a tst instance backed by a (not yet available) file_storage. Of course, this point is not related to the performance loss at all.
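To make the first point concrete, here is a minimal sketch of the "let the compiler manage the reference count" idea. It is not the actual PythonReference class from pytst: FakeObject, fake_incref and fake_decref are hypothetical stand-ins for PyObject, Py_INCREF and Py_DECREF, so that the example compiles without Python.h.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for PyObject: only a reference count.
struct FakeObject {
    int refcount = 1;
};

// Stand-ins for Py_INCREF / Py_DECREF. Unlike the real Py_DECREF,
// fake_decref never deallocates the object when the count hits zero.
inline void fake_incref(FakeObject* o) { ++o->refcount; }
inline void fake_decref(FakeObject* o) {
    assert(o->refcount > 0);
    --o->refcount;
}

// RAII holder (hypothetical name): owns one reference while it is alive.
class Reference {
public:
    // Takes a new reference to obj; obj may be null.
    explicit Reference(FakeObject* obj = nullptr) : obj_(obj) {
        if (obj_) fake_incref(obj_);
    }
    // Copying takes one more reference to the same object.
    Reference(const Reference& other) : obj_(other.obj_) {
        if (obj_) fake_incref(obj_);
    }
    // Copy-and-swap: the by-value parameter releases the old reference.
    Reference& operator=(Reference other) {
        std::swap(obj_, other.obj_);
        return *this;
    }
    // Releases the reference, whatever path leaves the scope.
    ~Reference() {
        if (obj_) fake_decref(obj_);
    }
    FakeObject* get() const { return obj_; }

private:
    FakeObject* obj_;
};
```

With a holder like this, every early return or exception releases the reference automatically, which is the kind of bookkeeping that is easy to get wrong by hand.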
It turns out that when you use a profiler (I used the 15-day free evaluation of AQtime 4, which has everything I could dream of for a profiler, the first thing being that "it just works"), a lot of methods from std::vector are hotspots. Granted, this is in debug mode, so some code that should be inlined is not, but if a function is a hotspot when called, it has a good chance of remaining a hotspot when inlined, don't you think? Well, I've quickly reimplemented a std::vector-like behaviour right into memory_storage, and presto, I get back nearly all of my 10% of lost performance. Plus, the source code is not much more complicated.
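For illustration, a std::vector-like storage can be sketched in a few dozen lines. This is not the memory_storage code from pytst: the Node layout, the NodeStorage name, the initial capacity of 16 and the doubling growth policy are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical node layout; the real pytst node is not shown in this post.
struct Node {
    char c;
    int next;
};

// Hypothetical growable array of nodes, addressed by index.
class NodeStorage {
public:
    NodeStorage() : data_(nullptr), size_(0), capacity_(0) {}
    ~NodeStorage() { delete[] data_; }
    NodeStorage(const NodeStorage&) = delete;
    NodeStorage& operator=(const NodeStorage&) = delete;

    // Appends a node and returns its index, growing the buffer when full.
    std::size_t push(const Node& n) {
        if (size_ == capacity_) grow();
        data_[size_] = n;
        return size_++;
    }

    // Plain pointer offset; the index is checked only when asserts are enabled.
    Node& get(std::size_t index) {
        assert(index < size_);
        return data_[index];
    }

    std::size_t size() const { return size_; }

private:
    // Doubles the capacity (assumed policy) and copies the existing nodes.
    void grow() {
        std::size_t new_capacity = capacity_ ? capacity_ * 2 : 16;
        Node* bigger = new Node[new_capacity];
        std::copy(data_, data_ + size_, bigger);
        delete[] data_;
        data_ = bigger;
        capacity_ = new_capacity;
    }

    Node* data_;
    std::size_t size_;
    std::size_t capacity_;
};
```

The access path in get() is a single indexed load, with nothing else between the caller and the element. Whether that matches what the real class does is a guess.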
So, sometimes it's worth reinventing the wheel, especially for simple cases like that.
Anyway, those results are not especially surprising: the Microsoft implementation of std::vector::at (which was the method my code used the most) instantiates an iterator and uses the overloaded + operator, only to access a vector element! I can't imagine how this could be more complicated. I don't know if this kind of convoluted way of doing things is also found in other STL implementations, but it is a bit creepy...
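Independently of how a given library implements it, at() is required by the C++ standard to check the index and throw std::out_of_range, while operator[] carries no such requirement. The sketch below only contrasts the two access styles. The function names are made up for the example, and it makes no claim about how much of the overhead measured here comes from the bounds check.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Sums the elements through at(): every access is bounds-checked.
int sum_checked(const std::vector<int>& v) {
    int total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) total += v.at(i);
    return total;
}

// Sums the elements through operator[]: the standard requires no check here.
int sum_unchecked(const std::vector<int>& v) {
    int total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) total += v[i];
    return total;
}

// Returns true if at() rejects an index one past the end.
bool at_throws_out_of_range(const std::vector<int>& v) {
    try {
        (void)v.at(v.size());
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

When the index is already known to be valid, as in a loop bounded by size(), operator[] gives the same result without the extra check.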
Now, after the removal of std::vector, what's the top hotspot? SWIG_Python_ConvertPtr! Looks like SWIG will have to disappear... I'm really thinking about using PyCXX or Boost.Python.
It is very frustrating to have to fight against all those layers of foreign code, when all you want to do is optimize your own code! Anyway, here comes pytst 0.93!