I love performance testing and tuning. Comparing how a
product works under different conditions makes me giddy. So it’s no surprise I’ve been a busy little
beaver since Tableau 9 officially released.
Over the next couple of weeks I have tons of stuff for you, focusing on
Tableau Server 9 running on different hardware & process
configurations.
Introducing TabJolt
I did nearly all of my work with TabJolt, a new
“Point and Run” load generator created by the fine folks at Tableau. It was built to make it easy to drive load
against Tableau Server and is used for much of our internal QA work.
You can learn more about TabJolt by watching this video, and
you can download the bits here. A video that walks you through setting up and using the tool can be found here.
I think of TabJolt as the point-and-click camera I bring on
vacation – it’s not a fancy DSLR with tons of options that only a professional
photographer knows how to use. If you need all those bells and whistles and are
willing to learn how to use them correctly, you’ll need to purchase a product
like HP LoadRunner. TabJolt is for the
rest of us – folks who just want to drive some basic load to make sure Tableau
can do what we need it to do.
FYI, TabJolt is not supported by Tableau directly – you’ll
want to rely on the community for assistance like you do for most other open
source products.
TabJolt can run two types of tests (here is a direct copy /
paste from the docs):
For the interaction mix (InteractVizLoadTest),
TabJolt selects the URLs for the views, which you have provided in vizpool.csv
file, based on a uniform distribution. Then it tries to load the view. After
the viz is loaded, TabJolt checks whether the viz has any elements that allow
interaction with the view (such as a slider bar, drop-down menu, and so on). If
the view has interaction elements, TabJolt performs those interactions without
requiring script development. If the
view doesn’t have any interactions, TabJolt selects marks on the view.
The ViewVizLoadTest, on the other hand, simply loads the visualization
without doing any interactions.
So, the first test imitates users who run a report and then
actively click around and “play” with the viz after it is rendered. The second
test acts like the guy who comes in first thing in the morning, looks at his
dashboard to make sure everything is “green”, and then gets his coffee and
starts to work – no interaction. Clear?
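If the vizpool.csv / “uniform distribution” bit sounds abstract, here’s a minimal Python sketch of what I understand is going on – the view paths are made up and the single-column CSV layout is my assumption, so check the TabJolt docs for the real format:

```python
import csv
import random

# Hypothetical view paths -- use the paths to your own published views.
# My assumption: vizpool.csv is just one relative view URL per line.
views = [
    "/views/Superstore/Overview",
    "/views/Superstore/Product",
    "/views/Regional/Stocks",
]

with open("vizpool.csv", "w", newline="") as f:
    writer = csv.writer(f)
    for view in views:
        writer.writerow([view])

# "Uniform distribution" just means every view in the pool is equally likely
# to be picked on each iteration -- conceptually like this:
for _ in range(5):
    print(random.choice(views))
```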
TabJolt allows you to generate load using as many vUsers
(concurrent users) as you wish. You can
also tell it how long to run a specific test.
When a test completes, you can explore the results using a set of
Tableau vizzes included with TabJolt.
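To give you a feel for how runs get kicked off, here’s a rough Python sketch that sweeps a few vUser counts and shells out to TabJolt for each one. The go.bat flags (--t for the test plan, --c for vUser count, --d for duration in seconds) match my reading of the TabJolt readme, but verify them against the copy you download:

```python
import subprocess

# Sweep these vUser counts; each run lasts DURATION_SECS (my arbitrary choices).
VUSER_COUNTS = [10, 20, 40]
DURATION_SECS = 600
TEST_PLAN = r"testplans\InteractVizLoadTest.jmx"

for vusers in VUSER_COUNTS:
    # Flags per my reading of the TabJolt readme -- double-check your version.
    cmd = ["go.bat", f"--t={TEST_PLAN}", f"--c={vusers}", f"--d={DURATION_SECS}"]
    print("Running:", " ".join(cmd))
    subprocess.run(cmd, check=True)
```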
You don’t want to install TabJolt on the same box as Tableau Server.
You need a different machine to act as your load generator. I dropped TabJolt on an older box – a single 4-core i5 CPU with 32 GB of RAM. While running tests I never saw
TabJolt use more than 25% of my CPU unless it was writing tons of text to the
console window – then it took 50%+ or so.
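If you want to keep an eye on your own load generator the same way, a quick system-wide CPU log is plenty. Here’s a tiny Python sketch using psutil (the sampling interval and run length are arbitrary choices of mine):

```python
import time
import psutil

SAMPLE_SECONDS = 1   # how long each CPU sample averages over
RUN_SECONDS = 600    # how long to keep logging

start = time.time()
while time.time() - start < RUN_SECONDS:
    # cpu_percent() with an interval blocks that long, then returns overall CPU %.
    usage = psutil.cpu_percent(interval=SAMPLE_SECONDS)
    print(f"{time.strftime('%H:%M:%S')}  CPU: {usage:.0f}%")
```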
Summary: TabJolt is good stuff, and in the Tableau tradition
it allows you to answer all sorts of questions about Tableau Server for yourself, without needing to hire
someone to do the work for you. Combining TabJolt with AWS EC2 gives you a really awesome way to quickly iterate through various hardware AND software configurations – fast and cheap.
Questions, questions…
The first question I wanted to answer for myself was:
If I’m going to run 16 cores of Tableau Server, what is the
best hardware configuration to use?
- Should I put all 16 cores on one machine?
- Should I split my 16 cores across two machines?
- How about splitting across 4 machines?
- Maybe 8 machines (NO! Just no. NO.)
And then I wondered:
- How many VizQLs should I run, and where should I put them?
- Should I put a data engine on each node, or isolate them?
I hear questions like the ones above all the time, and the
answer is always “It depends on your
workload”.
Fine, smarty-pants. I know my workloads, and now I have a simple tool to
help me test…I’m all set.
Hardware & Software
I chose to lean on AWS EC2 for my hardware. I love my EC2:
16 Core Machine:
- Model: c3.8xlarge
- vCPU: 32
- Mem (GiB): 60
8 Core Machine(s):
- Model: c3.4xlarge
- vCPU: 16
- Mem (GiB): 30
4 Core Machine(s):
- Model: c3.2xlarge
- vCPU: 8
- Mem (GiB): 15
These all run Xeon E5-2680 processors, not top-of-the-line E5 v3s.
This round of testing isn’t about trying to see “how fast will it go”. I just want to compare different configurations so I can make an informed decision about what architecture to use based on my workload.
I was also lazy and installed both the OS and Tableau to C: on each machine. I made the C: volume 60 GB and gave it 1500 IOPS via AWS EBS. (If what you just read doesn’t make sense yet, start here.)
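For what it’s worth, here’s roughly how you could script one of these boxes with boto3. The AMI ID, key pair, and security group below are placeholders, and the io1 volume settings simply mirror the 60 GB / 1500 IOPS setup I just described – treat it as a sketch, not a recipe:

```python
import boto3

ec2 = boto3.resource("ec2", region_name="us-east-1")

# Placeholders: swap in your own Windows AMI, key pair, and security group.
instances = ec2.create_instances(
    ImageId="ami-xxxxxxxx",          # hypothetical Windows Server AMI
    InstanceType="c3.8xlarge",       # the 16-core (32 vCPU) box from above
    KeyName="my-keypair",
    SecurityGroupIds=["sg-xxxxxxxx"],
    MinCount=1,
    MaxCount=1,
    BlockDeviceMappings=[{
        "DeviceName": "/dev/sda1",   # root volume -> ends up as C: on Windows
        "Ebs": {
            "VolumeSize": 60,        # GB
            "VolumeType": "io1",     # provisioned IOPS
            "Iops": 1500,
        },
    }],
)
print("Launched:", instances[0].id)
```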
FYI, I used Tableau Server v9 x64 across the board.
The Workload
I tested (mostly) with Tableau Sample vizzes as my workload. You’ll see lots of familiar faces below. I recorded the time it took to render each viz on the single 16-core machine from a dead stop. I actually wiped the external query cache on purpose to force Tableau to “start at the beginning” when these were rendered.
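If you want to do the same sort of “dead stop” timing yourself, a crude way is to time an HTTP GET of each view from a script. This sketch assumes a placeholder server URL, that you’ve already dealt with sign-in (session cookie, trusted ticket, whatever), and that the “:refresh=yes” URL parameter – which, as I understand it, asks Tableau to skip its cache – behaves the way I expect:

```python
import time
import requests

SERVER = "http://my-tableau-server"   # placeholder -- your server here
VIEWS = [
    "/views/Superstore/Overview",     # hypothetical view paths
    "/views/Regional/Stocks",
]

session = requests.Session()          # assumes auth/cookies already sorted out

for view in VIEWS:
    start = time.time()
    # ":refresh=yes" should force a fresh render rather than a cached one.
    resp = session.get(SERVER + view + "?:refresh=yes")
    elapsed = time.time() - start
    print(f"{view}: HTTP {resp.status_code} in {elapsed:.1f}s")
```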
A few notes:
- If you see “Nope”, it means I didn’t actually directly use the viz in a load test – it is part of a larger dashboard, and that dashboard was executed
- The lovely pink “H” means this is part of my “heavy” workload, which we’ll talk about in subsequent blog posts – as you can see, these take much longer to render as they are built to punish 🙂
- Some of the “heavy” workbooks were provided by customers. They’re therefore anonymized – no thumbnail for you!
Everything in the workload above is extract-based. This means I’m going to get fast response times, but I’ll be putting my data engine(s) through a workout!
The Configurations
I tested five configurations:
1 x (16 Cores): I ran setup on this box and took all the default options: 2 VizQL processes, a single data engine, 2 cache servers, and all the other stuff that comes with Tableau Server. I didn’t bother to take a screenshot of this config as it is the “out of the box” experience. Probably should have taken a picture anyway. Sorry.
2 x (8 Cores) v1: 2 VizQLs & 2 Cache Servers on each box. A single data engine on Worker 1. Note how I didn’t even bother adding any Backgrounders or Data Servers. I thought, “I don’t need these for the test,” so why bother? This was poor thinking, and I’ll explain why later.
2 x (8 Cores) v2: Same as above, except I put data engines on both machines. I also learned from my mistakes with the 16-core and 8-core v1 tests and dropped a Backgrounder in even though I don’t really use it (or do I?!)
4 x (4 Cores) v1: (Generally) One of everything on each node: Repository on two nodes. Backgrounder on only one node.
4 x (4 Cores) v2: I try to get smart – isolate VizQLs and Data Engines on different nodes. I want my VizQLs to access as much of a machine’s resources as possible, so I put most of the “other” services on the data engine boxes (Workers 1 and 3).
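If you’d rather script these topology changes than click through the Configuration utility, tabadmin can set per-worker process counts. The key names below are my best recollection (worker0 being the primary, worker1 and up being the workers) – treat them as an assumption and verify against the Tableau Server admin guide before leaning on them:

```python
import subprocess

# Run this from the Tableau Server bin directory (where tabadmin.bat lives).
# Key names are my recollection of the per-worker process-count settings --
# verify them before use; they are not guaranteed to be exact.
settings = {
    "worker0.vizqlserver.procs": "2",
    "worker0.cacheserver.procs": "2",
    "worker1.vizqlserver.procs": "2",
    "worker1.dataengine.procs": "1",
    "worker1.backgrounder.procs": "1",
}

for key, value in settings.items():
    subprocess.run(["tabadmin.bat", "set", key, value], check=True)

# Push the new topology out and restart the server.
subprocess.run(["tabadmin.bat", "configure"], check=True)
subprocess.run(["tabadmin.bat", "restart"], check=True)
```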
The results? Don’t click the link below quite yet, but they’re here:
I’m going to start in on a post which explains this stuff, but I wanted to get something up ASAP.